Data analysis is interesting.
Data preparation is not.

Clay automates the three major steps of data preparation:

1) Get your data: accessing the format and location,

2) Know your data: understanding what is in your data, and

3) Transform your data: making it usable for analysis.

To gain access to Clay, sign up here.

Gain Access

Or read more below to see how you can use Clay to sculpt your raw data into enhanced, usable data.

What is Clay?

Clay is an intelligent data pipeline system. With Clay, you can stream data and convert it automatically between a wide range of formats, sources, and destinations. Along with conversion, Clay uses deep learning algorithms to recognize dates and addresses in your data. It can then transform your data by standardizing dates, segmenting addresses, filtering columns, and limiting data intake, so that your data is ready for analysis efficiently and effectively.

How Does Clay Work?

Get Your Data

Clay allows you to access, translate, and transport data stored in a variety of formats and locations.

For example, Clay can read a CSV dataset located in Amazon S3, translate it, and store it in a PostgreSQL database.
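Clay's own interface is not shown on this page, so the following is only a minimal sketch of the read, translate, and store flow using Python's standard library. It makes loud assumptions: a plain string stands in for the CSV object in Amazon S3, an in-memory SQLite database stands in for PostgreSQL, and `load_csv_into_table` is a hypothetical helper name, not part of Clay.

```python
import csv
import io
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier so names with spaces or quotes are safe."""
    return '"' + name.replace('"', '""') + '"'


def load_csv_into_table(csv_text, conn, table):
    """Read CSV text (first row = header) into a new table; return the row count.

    Hypothetical helper: SQLite stands in for PostgreSQL here.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(quote_ident(name) + " TEXT" for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute("CREATE TABLE {} ({})".format(quote_ident(table), columns))
    rows = list(reader)
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quote_ident(table), placeholders),
        rows,
    )
    conn.commit()
    return len(rows)


# A string stands in for the CSV file that would live in Amazon S3.
csv_text = 'name,born\nJohn Adams,"october 30, 1735"\n'
conn = sqlite3.connect(":memory:")
loaded = load_csv_into_table(csv_text, conn, "people")
```

Every column is stored as text in this sketch; type inference is left out on purpose to keep the flow visible.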

Formats

The following data formats are supported:

  • CSV (.csv)
  • JSON (.txt, .json)
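To make the conversion between these two formats concrete, here is a small sketch using only Python's standard library. It is not Clay's implementation; `csv_to_json` and `json_to_csv` are hypothetical names, and the sketch assumes a header row in the CSV and a flat array of objects in the JSON.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text (first row = header) to a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


def json_to_csv(json_text):
    """Convert a JSON array of flat objects to CSV text."""
    rows = json.loads(json_text)
    if not rows:
        return ""
    out = io.StringIO()
    # The first object's keys define the column order.
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


csv_text = "name,city\nJohn Adams,Quincy\n"
as_json = csv_to_json(csv_text)
round_trip = json_to_csv(as_json)
```

All values come out of the CSV reader as strings, so numbers are not restored to numeric types in this sketch.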

Locations: Data Sources & Destinations

Your data can reside in any of the following locations, and Clay can access it there and deliver it as well. Read more below on the specifics of each supported data store.

Type          Source    Target
Amazon S3
Socrata                 🚫 (read-only)
PostgreSQL
MongoDB

  • Amazon S3

    Access all your files hosted on Amazon S3 securely.

  • Socrata

    Socrata is the most trusted and widely used government data store. Load datasets hosted on OpenData by Socrata directly.

  • PostgreSQL

    Connect to your PostgreSQL database as a source or a destination.

  • MongoDB

    Clay supports the transformation of MongoDB NoSQL database collections.

Know Your Data

Clay accesses Datalogue's proprietary, ontology-mapping, deep learning algorithms to find and classify the following categories in your data:

  • Dates
  • Addresses

Given an example dataset:

N           A                         D
John Adams  1600 pennsylvania avenue  october 30, 1735

Clay classifies each column as follows:

N           (DTL) Classification - N  A                         (DTL) Classification - A  D                 (DTL) Classification - D
John Adams  full_name*                1600 pennsylvania avenue  address                   october 30, 1735  date

* Clay offers only a limited classification, covering dates and addresses. To learn more about the other ontology classifications that Datalogue provides, including Pharma, Finance, and custom ontologies, contact us.
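Clay's classification relies on Datalogue's deep learning models, which are not shown here. The sketch below swaps in a few hand-written regular expressions purely to illustrate the shape of the result: each column receives a label of "date", "address", or nothing. The patterns and the names `classify_value` and `classify_columns` are assumptions made for illustration.

```python
import re

# Rule-based stand-ins for the real models (assumption, illustration only).
DATE_RE = re.compile(
    r"^(?:\d{1,2}/\d{1,2}/\d{4}"        # 4/16/2017
    r"|[^\W\d_]+ \d{1,2}, \d{4}"        # october 30, 1735
    r"|\d{1,2} [^\W\d_]+ \d{4})$"       # 8 juillet 1994
)
ADDRESS_RE = re.compile(
    r"^\d+\s+.*\b(?:avenue|street|road|boulevard|ave|st|rd|blvd)\b"
)


def classify_value(value):
    """Return "date", "address", or None for a single cell."""
    s = value.strip().lower()
    if DATE_RE.match(s):
        return "date"
    if ADDRESS_RE.match(s):
        return "address"
    return None


def classify_columns(rows):
    """Label a column only when every one of its values gets the same label."""
    if not rows:
        return {}
    labels = {}
    for column in rows[0]:
        found = {classify_value(row[column]) for row in rows}
        labels[column] = found.pop() if len(found) == 1 else None
    return labels


rows = [{"N": "John Adams", "A": "1600 pennsylvania avenue", "D": "october 30, 1735"}]
labels = classify_columns(rows)
```

Column N stays unlabeled here because this sketch, like Clay's limited classification, only covers dates and addresses.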

Transform Your Data

Clay supports four types of transformations on your data:

  • Standardize dates
  • Parse addresses
  • Filter columns
  • Limit data intake

1) Standardize Dates

Translate all dates into a standardized format (YYYY/MM/DD).

  • "October 30, 1735" ➡️ 1735/10/30
  • "4/16/2017" ➡️ 2017/04/16
  • "8 juillet 1994" ➡️ 1994/07/08
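The three conversions above can be reproduced with a short rule-based sketch. This is not how Clay does it (Clay uses learned models); it only handles the three layouts shown, assumes month/day/year order for slash-separated dates, and `standardize_date` is a hypothetical name.

```python
import re
from datetime import date

# Lowercase month names in English and French.
MONTHS = {
    "january": 1, "february": 2, "march": 3, "april": 4, "may": 5, "june": 6,
    "july": 7, "august": 8, "september": 9, "october": 10, "november": 11,
    "december": 12,
    "janvier": 1, "février": 2, "mars": 3, "avril": 4, "mai": 5, "juin": 6,
    "juillet": 7, "août": 8, "septembre": 9, "octobre": 10, "novembre": 11,
    "décembre": 12,
}


def standardize_date(text):
    """Return text as YYYY/MM/DD, or None if it matches no known layout."""
    s = text.strip().lower()
    parts = None  # (year, month, day) once a layout matches

    m = re.fullmatch(r"([^\W\d_]+)\s+(\d{1,2}),?\s+(\d{4})", s)
    if m and m.group(1) in MONTHS:  # "october 30, 1735"
        parts = (int(m.group(3)), MONTHS[m.group(1)], int(m.group(2)))

    m = re.fullmatch(r"(\d{1,2})\s+([^\W\d_]+)\s+(\d{4})", s)
    if m and m.group(2) in MONTHS:  # "8 juillet 1994"
        parts = (int(m.group(3)), MONTHS[m.group(2)], int(m.group(1)))

    m = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", s)
    if m:  # "4/16/2017", assumed to be month/day/year
        parts = (int(m.group(3)), int(m.group(1)), int(m.group(2)))

    if parts is None:
        return None
    try:
        d = date(*parts)  # rejects impossible dates such as month 2, day 30
    except ValueError:
        return None
    return f"{d.year:04d}/{d.month:02d}/{d.day:02d}"


print(standardize_date("October 30, 1735"))  # 1735/10/30
print(standardize_date("4/16/2017"))         # 2017/04/16
print(standardize_date("8 juillet 1994"))    # 1994/07/08
```

The month table avoids locale-dependent parsing, which is why the French example works without changing system settings.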

2) Parse Addresses

Segment out addresses into the following parts:

  • Street Address
  • City
  • State
  • Zip Code
  • Country

Original Address                                    (DTL) Street Address        (DTL) City  (DTL) State  (DTL) Zip Code
625 Avenue of the Americas, New York, NY 10011 USA  625 Avenue of the Americas  New York    NY           10011
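As a rough illustration of this segmentation, the sketch below splits an address with a single regular expression. It assumes a US-style "street, city, ST zip [country]" layout with comma separators, which is far narrower than what a learned parser would accept; `parse_address` and the dictionary keys are hypothetical names.

```python
import re

# Assumed layout: "<street>, <city>, <2-letter state> <zip> [<country>]"
ADDRESS_RE = re.compile(
    r"^\s*(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)"
    r"(?:,?\s+(?P<country>[A-Za-z .]+?))?\s*$"
)


def parse_address(text):
    """Split an address into its parts, or return None if the layout differs."""
    m = ADDRESS_RE.match(text)
    if m is None:
        return None
    return {
        "street_address": m.group("street").strip(),
        "city": m.group("city").strip(),
        "state": m.group("state"),
        "zip_code": m.group("zip"),
        "country": m.group("country"),  # None when the address omits it
    }


parsed = parse_address("625 Avenue of the Americas, New York, NY 10011 USA")
```

Returning None for anything outside the assumed layout keeps the sketch from silently producing wrong segments.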

3) Filter Columns

Select only the columns of interest in datasets for a more relevant output.
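In code, column filtering amounts to keeping a chosen subset of keys from every row. A minimal sketch, assuming rows are dictionaries keyed by column name (`filter_columns` is a hypothetical name, not Clay's API):

```python
def filter_columns(rows, keep):
    """Yield each row reduced to the columns named in keep, in that order."""
    for row in rows:
        yield {name: row[name] for name in keep}


rows = [{"N": "John Adams", "A": "1600 pennsylvania avenue", "D": "october 30, 1735"}]
selected = list(filter_columns(rows, ["N", "D"]))
```

Because the function is a generator, it can sit in a streaming pipeline without holding the whole dataset in memory.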

4) Limit Data

Limit the number of rows read from a source for faster processing.
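The speedup comes from stopping the read early rather than discarding rows afterwards. A minimal sketch of that idea with `itertools.islice`, assuming the source is any Python iterator of rows (`limit_rows` and `source` are hypothetical names):

```python
from itertools import islice


def limit_rows(rows, max_rows):
    """Return an iterator over at most max_rows rows, reading no further."""
    return islice(rows, max_rows)


reads = []  # records how many rows were actually pulled from the source


def source():
    """Stand-in for a large data source."""
    for i in range(1000):
        reads.append(i)
        yield {"id": i}


first_three = list(limit_rows(source(), 3))
```

Only three rows are pulled from the source here, even though it could produce a thousand.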

Contact

If you have ideas or feedback, feel free to reach out.

Release History

Alpha

Ready to get started? Sign up here to gain access to Clay.

Learn more about extending Clay's capabilities for unlocking your enterprise's data at Datalogue.