Research Data Management

Preservation

 

Repositories

  • The best way to preserve your data is to deposit it in a repository.
  • Many funding agencies also require that data be deposited in a repository.
  • Some publishers require that data be publicly available prior to acceptance and publication.

Digital Commons is Connecticut College's institutional repository.

Best practices for files

 

When selecting file formats for archiving, the formats should ideally be:

  • Non-proprietary: .docx, .xlsx, and other Microsoft Office file formats are proprietary and therefore not preferred
  • Unencrypted
  • Uncompressed
  • In common usage by the research community
  • Adherent to an open, documented standard. The State of California (see AB 1668, 2007) describes such a standard as:
    • Interoperable among diverse platforms and applications
    • Fully published and available royalty-free
    • Fully and independently implementable by multiple software providers on multiple platforms without any intellectual property restrictions for necessary technology
    • Developed and maintained by an open standards organization with a well-defined inclusive process for evolution of the standard

Converting files to new formats

  • Note conversion steps taken
  • If possible, keep the original file as well as the converted one 
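
Both points can be illustrated with a minimal Python sketch: it re-encodes a text file to UTF-8, leaves the original untouched, and appends a note of the step to a log file. The function name `convert_to_utf8`, the log file name `conversion_log.jsonl`, and the Latin-1 source encoding are assumptions chosen for illustration, not requirements of any repository.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def convert_to_utf8(source: Path, source_encoding: str = "latin-1",
                    log_path: Path = Path("conversion_log.jsonl")) -> Path:
    """Write a UTF-8 copy of `source` next to it and log the step.

    The original file is never modified or deleted.
    """
    target = source.with_name(f"{source.stem}_utf8{source.suffix}")

    # newline="" keeps the original line endings unchanged.
    with source.open("r", encoding=source_encoding, newline="") as src:
        text = src.read()
    with target.open("w", encoding="utf-8", newline="") as dst:
        dst.write(text)

    # One JSON object per line: what was converted, how, and when.
    entry = {
        "date": datetime.now(timezone.utc).isoformat(),
        "original": source.name,
        "original_sha256": sha256_of(source),
        "converted": target.name,
        "converted_sha256": sha256_of(target),
        "step": f"re-encoded text from {source_encoding} to utf-8",
        "tool": "Python standard library",
    }
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return target
```

Calling `convert_to_utf8(Path("notes.txt"))` would create `notes_utf8.txt` beside the original and add one line to the log. The checksums let a later reader confirm that neither file has changed since the conversion.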

When stating what format you'll be using:

  • Note what software is needed to view the file
  • Provide information about version control
  • Explain any anticipated format changes, such as using one file type for collection and another for analysis.
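
One way to keep such a statement consistent across a project is to generate it from a small record. The Python sketch below is illustrative only: the class name `FormatStatement` and its field names are hypothetical, chosen to mirror the three points above.

```python
from dataclasses import dataclass


@dataclass
class FormatStatement:
    """One entry describing a file format used in a project (hypothetical structure)."""
    file_format: str
    viewing_software: str
    version_control: str
    planned_changes: str = "none anticipated"

    def render(self) -> str:
        """Return the statement as plain text, suitable for a README file."""
        return (
            f"Format: {self.file_format}\n"
            f"Software needed to view: {self.viewing_software}\n"
            f"Version control: {self.version_control}\n"
            f"Anticipated format changes: {self.planned_changes}\n"
        )
```

For example, `FormatStatement("CSV", "any text editor or spreadsheet program", "dated file names").render()` produces a four-line note that can be pasted into a data management plan or README.
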

Preferred file formats

 

Compressed (when compression cannot be avoided):

  • TAR
  • GZIP
  • ZIP

Databases:

  • XML
  • CSV

Text:

  • RTF
  • ODT
  • XML
  • HTML
  • ASCII
  • UTF-8

Still images:

  • TIFF
  • JPEG 2000
  • PDF
  • PNG
  • GIF
  • BMP

Geospatial:

  • SHP
  • DBF
  • GeoTIFF
  • NetCDF

Moving images:

  • MOV
  • MPEG
  • AVI
  • MXF

Spectra:

  • JCAMP

Sounds:

  • WAV
  • AIFF
  • MP3
  • MXF

Statistics:

  • CSV
  • ASCII
  • DTA
  • POR
  • SAS
  • SAV

Web archive:

  • WARC
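
To check a project folder against the lists above, something like the following Python sketch could be used. The mapping from format names to file extensions (for example `.jp2` for JPEG 2000 or `.jdx` for JCAMP) is an assumption made for illustration and covers only part of the lists. The function name `classify` is likewise hypothetical.

```python
from pathlib import Path

# Assumed extension-to-category mapping for part of the preferred-format lists.
PREFERRED = {
    ".csv": "Databases / Statistics",
    ".xml": "Databases / Text",
    ".rtf": "Text",
    ".odt": "Text",
    ".html": "Text",
    ".txt": "Text",  # plain ASCII or UTF-8 text
    ".tif": "Still images",
    ".tiff": "Still images",
    ".jp2": "Still images",
    ".png": "Still images",
    ".wav": "Sounds",
    ".aiff": "Sounds",
    ".jdx": "Spectra",
    ".warc": "Web archive",
}


def classify(paths):
    """Split paths into (preferred, needs_review) by file extension."""
    preferred, needs_review = [], []
    for p in map(Path, paths):
        category = PREFERRED.get(p.suffix.lower())
        if category:
            preferred.append((p.name, category))
        else:
            needs_review.append(p.name)
    return preferred, needs_review
```

A call such as `classify(["survey.csv", "report.docx", "scan.TIF"])` places `survey.csv` and `scan.TIF` in the first list and `report.docx` in the second. A file in the second list is not necessarily unsuitable, only worth a closer look before deposit.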