When you store output objects in a vector layer, ArcGIS Velocity manages the data according to a set of data retention policies. Data retention usually refers to the period of time that the data is actively maintained in the vector layer.

Purpose of data retention
With data retention, vector layers can be kept at a manageable size even as real-time data streams continuously add objects. This prevents the underlying dataset from growing indefinitely, which matters most when older data becomes less relevant to understanding trends and viewing recent events.

Data retention is not intended to limit the available objects to a specific time frame. It ensures that data is kept in the vector layer for at least the specified period. Because the deletion process runs periodically on a schedule, the layer may contain data older than the specified period at any given time. To make sure that maps display a specific time window, filter the data with an appropriate query in the client application.
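As a minimal sketch of that client-side filtering, the Python below builds a time-window filter expression that a client could send with a query. The field name obs_time and the helper recent_window_where are hypothetical, and the TIMESTAMP literal follows standard SQL; check which date syntax your client application and layer actually accept.

```python
from datetime import datetime, timedelta, timezone


def recent_window_where(field, window, now=None):
    """Build a SQL-style filter that keeps only rows newer than now - window.

    `field` is a hypothetical date field name on the layer.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    return f"{field} >= TIMESTAMP '{cutoff:%Y-%m-%d %H:%M:%S}'"


# Show only the last 24 hours, regardless of what the layer still holds.
fixed_now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
where = recent_window_where("obs_time", timedelta(hours=24), now=fixed_now)
print(where)  # obs_time >= TIMESTAMP '2024-05-01 12:00:00'
```

Filtering in the query keeps the displayed window exact even when the scheduled cleanup has not yet removed older objects.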

Data retention process
When you define an output vector layer in a real-time or big data analytic, you can specify the data retention period to apply to that vector layer. For example, you can keep weather data for only the past day, but keep a history of fleet or vehicle locations for six months. You can also export older data to a vector layer archive (cold storage), which you can access when you need to run analysis on historical data.

Data retention options for output vector layers
When a data retention period is set for a vector layer, objects older than the specified period are routinely removed from the underlying dataset. If data export is enabled, these objects are exported to the vector layer archive (cold storage) before they are deleted. For data retention purposes, an object's age is determined from the timestamp of when the record was created in the underlying dataset, which may or may not coincide with the object's own start time. Retention is based on creation time so that the same approach applies to all datasets, including those that represent interval data or that have no date or time information in the object records.
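The following Python sketch illustrates the rule described above: one cleanup pass compares each record's creation time, not its start time, against the retention cutoff, and hands expired records to the archive first when export is enabled. The record layout and the run_purge function are hypothetical and only model the behavior; they are not Velocity's implementation.

```python
from datetime import datetime, timedelta


def run_purge(records, now, retention, export=False):
    """Simulate one cleanup pass.

    Age is measured from `created` (when the row was written),
    not from `start_time` (the time the observation describes).
    Returns (kept, archived).
    """
    cutoff = now - retention
    kept = [r for r in records if r["created"] >= cutoff]
    expired = [r for r in records if r["created"] < cutoff]
    archived = expired if export else []
    return kept, archived


now = datetime(2024, 5, 2, 12, 0)
records = [
    # Old start time, but written recently: kept.
    {"id": 1, "start_time": datetime(2024, 1, 1), "created": datetime(2024, 5, 2, 6, 0)},
    # Written more than a day ago: removed, and archived because export is on.
    {"id": 2, "start_time": datetime(2024, 4, 30), "created": datetime(2024, 4, 30, 9, 0)},
]
kept, archived = run_purge(records, now, timedelta(days=1), export=True)
print([r["id"] for r in kept], [r["id"] for r in archived])  # [1] [2]
```

Record 1 survives even though its start time is months old, because only the creation time counts toward its age.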

A data retention period is required only when the stored data grows over time. Whether it does depends on the Data Storage Method setting and on how existing data is handled when the analytic restarts.

For example, if you select Add new objects (rather than Save Last Object) and also select Save existing objects and schema for when the analytic is restarted, the stored data grows over time and a data retention period must be set.

However, if you select the Save Last Object option, only the most recent observation of each track is saved. This data can still grow as new sensors are added in your organization, but it typically levels off at a maximum size. In this case, no data retention period is required, and you can select the No Purge option. Vector layers created with the No Purge option retain their data indefinitely.

If the vector layer has a data retention period, you can export older data to the vector layer archive (cold storage). When this option is enabled, data older than the retention period is exported in the Parquet data format to an archive maintained by Velocity. Data stays in the archive for up to one year after the export date or until the maximum total size of the archive is reached, whichever comes first.

For example, if you choose a data retention period of one year and export old data to the archive, Velocity maintains your data for up to two years. If you choose a data retention period of one month and export old data to the archive, Velocity maintains your data for up to one year and one month.
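The arithmetic behind these two examples can be sketched in a few lines of Python. The function name and the month-based units are illustrative assumptions. The result is an upper bound only, since the archive size limit can end archival sooner.

```python
ARCHIVE_MONTHS = 12  # archive keeps data for up to one year after export


def max_lifetime_months(retention_months, export_enabled):
    """Upper bound on how long data is maintained, in months."""
    if export_enabled:
        return retention_months + ARCHIVE_MONTHS
    return retention_months


for retention in (12, 1):
    total = max_lifetime_months(retention, export_enabled=True)
    years, months = divmod(total, 12)
    print(f"retention {retention} mo + archive -> up to {years} yr {months} mo")
# retention 12 mo + archive -> up to 2 yr 0 mo
# retention 1 mo + archive -> up to 1 yr 1 mo
```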

Data retention export options for output vector layers
Data exported to the archive no longer appears in the vector layer. To work with archived objects, load them into a big data analytic using the Vector Layer (archive) data source type.