RMACC HPC Symposium has ended


Data Management
Wednesday, August 13
 

8:30am

Data Management in Linux

Once you understand the basics of Linux, it's time to learn how to create, remove, and manipulate files and the data within them. We'll cover a range of filesystem-related commands, plus explore utilities for pattern matching, editing, searching, and more.

If you would like to follow along with the examples, please bring a laptop that a) runs Linux or Mac OS X, or b) allows you to log in to a Linux server using ssh.

Peter Ruprecht from CU's Research Computing will again be giving this tutorial.



Wednesday August 13, 2014 8:30am - 10:00am
Wolf 304

10:30am

Best Practices for Data Management

Given recent initiatives from funding agencies and a push to make academic research more openly accessible, managing research data has become a critical part of the research process. This tutorial will discuss how to manage your data to maximize visibility for you and your project, and how to be more competitive when applying for research grants. Topics will include data storage, metadata, writing a successful data management plan, accessibility, and ways to use data to promote your research.



Wednesday August 13, 2014 10:30am - 12:00pm
Wolf 304

1:00pm

Introduction to Hadoop, HDFS and Data Analysis with Pig

This workshop will give an overview of Hadoop, an open source software framework for large-scale data processing, and of the Hadoop Distributed File System (HDFS). Pig, a high-level data processing language, will be used to perform data analysis exercises. Please bring your own laptop; a virtual machine with a single-node Hadoop installation will be provided.



Wednesday August 13, 2014 1:00pm - 2:30pm
Wolf 304

3:00pm

Globus for Research Data Management

The goal of the tutorial is to introduce researchers and systems administrators to the easy-to-use Globus services for moving, sharing, and publishing large amounts of data. As science becomes increasingly computational and data-intensive, moving and sharing data across organizations is inevitable. The cloud-hosted Globus service offers Dropbox-like simplicity for big data.

In this tutorial, attendees will learn how to perform fire-and-forget file transfer, sharing, and synchronization between their local machine, campus clusters, regional supercomputers and national cyberinfrastructure using Globus, via both Web and command line interfaces.

Tutorial attendees will also learn how to install Globus Connect Server on their campus cluster to provide data transfer endpoints to their users. The tutorial will include instruction on using Globus via the CLI, using scripts to control Globus operations, and using the Globus transfer REST API for programmatic interaction with Globus. By the end of the tutorial, participants will have the tools and information required to provide their users with Globus’s full range of benefits. Attendees will also get a preview of new Globus data publication and discovery functionality that will be delivered later this year.



Wednesday August 13, 2014 3:00pm - 4:30pm
Wolf 304