Technology for Research: Big Data

Big data is a term for research involving datasets that exceed one terabyte in size. At SAS, many big data projects exceed the capacity of publicly accessible resources, so on-premises or cloud-hosted private compute clusters are the standard solution. SAS Computing IT supports these systems in roles ranging from full management to hardware reliability monitoring.

To get started with your own research compute cluster, please contact your Local Support Provider.

Some hosted options are also available to the SAS community:

  • General Purpose Cluster (GPC): Compute cycles and storage can be purchased above standard allocations for larger datasets.
  • XSEDE (Extreme Science and Engineering Discovery Environment): A cloud-based research cluster supported by the National Science Foundation, suitable for very large datasets.

Also see Storage and Backup and High Performance Computing for more information.

For more information or to discuss your specific needs, please contact your Local Support Provider.