Introduction to Document Store
A document store is a NoSQL database that stores data as JSON (JavaScript Object Notation) documents. Unlike traditional relational database management systems, document databases do not require a schema, that is, a predefined structure with fixed tables and attributes. This is why they are also known as “non-relational” databases.
A document is based on the concept of a “Key-Value” store: every key has a corresponding value. Each document has a unique primary key, which is used for CRUD operations (Create, Read, Update, and Delete), and no two documents in a collection can share the same primary key. Multiple documents gathered in one structure are known as a “document collection”.
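The ideas above (schema-free documents, unique keys, and CRUD operations) can be illustrated with a minimal in-memory sketch. This is not the GDN API: the `DocumentCollection` class and its method names are hypothetical and exist only for illustration. The `_key` attribute name follows the console's primary key column described later in this article.

```python
import json
import uuid


class DocumentCollection:
    """Toy in-memory collection: maps a unique _key to a JSON-like document."""

    def __init__(self):
        self._docs = {}

    def create(self, doc):
        # Use the caller's _key if present, otherwise generate one.
        key = doc.get("_key") or uuid.uuid4().hex
        if key in self._docs:
            raise KeyError(f"duplicate _key: {key}")
        self._docs[key] = {**doc, "_key": key}
        return key

    def read(self, key):
        return self._docs[key]

    def update(self, key, changes):
        # Merge new attributes; the _key itself never changes.
        self._docs[key].update({k: v for k, v in changes.items() if k != "_key"})

    def delete(self, key):
        del self._docs[key]


users = DocumentCollection()
users.create({"_key": "alice", "city": "Paris"})
# No fixed schema: this document carries different attributes.
users.create({"_key": "bob", "age": 41, "tags": ["admin"]})
users.update("alice", {"city": "Lyon"})
print(json.dumps(users.read("alice")))
```

Creating a second document with the key "bob" would raise an error in this sketch, which mirrors the rule that no two documents can share a primary key.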
Document Collection:
This section describes in detail the features available in the GDN console and how to work with your document collections:
Navigate to the Document collection through Collections > Document in the sidebar menu.
You will now be able to see details about all the collections you have in your account. You can use the tools described below to work with your collections.
- Filter collections: The search bar filters collections by name in real time. The results narrow down as you type.
- Collection Type: This filter has 5 options. Click to select "All" or a single collection type. When a type is selected, only collections of that type are displayed. For example, selecting “Document” displays only Document collections.
- Documentation: A shortcut to Macrometa’s collections documentation, which describes all the collection types and data models in detail. More information is available in collections essentials.
- New Collection: Click to create a new collection to store data on Macrometa GDN.
- Collection Name: Lists your collections. Filters can be applied to narrow down the search for a specific collection.
- Data Model: Lists the type of each collection. If a type filter has been applied, only the collections matching that filter are shown.
- Stream Enabled: Shows whether a collection stream is enabled (Yes) or disabled (No).
- Distribution: When a collection is created, it can be locally or globally distributed across Macrometa servers. This column displays the distribution type of each collection: Local or Global.
Individual Document Collection
Data
The 'Data' tab is the primary section for viewing individual collection information.
- Results count: Select how many results per page to display.
- Edit Documents: This allows the transfer of a document from one collection to another.
- Import/Export: Import allows the user to upload documents in a JSON or CSV file. Export allows the user to download the documents in the collection as a JSON or CSV file which can be used for data pre-processing, cleaning, analytics etc.
- Filter Documents: This feature can be used to perform a lookup within the document collection.
Enter the attribute on which to filter and select a logical expression/operator from the dropdown menu. Then enter the value to filter against. You can add multiple filters and sort by attribute.
- ➕ / Create: Create a new document inside the selected collection and add data in the form of attributes and values.
- ➖ / Delete: Removes the selected document from the collection.
- _key: The primary key value for the document. Every document must have a unique key, which can be set by the user or generated automatically.
- Content: The section which displays the document data in the following format:
{ "attribute_1": value_1, "attribute_2": value_2, … }
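The Filter Documents behaviour described above (attribute, operator, value, multiple filters, and sorting) can be sketched in plain Python. This is only an illustration of the logic, not how the console is implemented: the `filter_documents` function, the operator set, and the sample documents are assumptions made for the example.

```python
import operator

# Comparison operators a filter row might offer (assumed set).
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def filter_documents(docs, filters, sort_by=None):
    """Keep documents that satisfy every (attribute, op, value) filter."""
    def matches(doc):
        for attribute, op, value in filters:
            if attribute not in doc:
                return False  # a missing attribute never matches
            if not OPERATORS[op](doc[attribute], value):
                return False
        return True

    result = [doc for doc in docs if matches(doc)]
    if sort_by is not None:
        # Assumes every matching document has the sort attribute.
        result.sort(key=lambda doc: doc[sort_by])
    return result


docs = [
    {"_key": "1", "name": "Ann", "age": 34},
    {"_key": "2", "name": "Bob", "age": 27},
    {"_key": "3", "name": "Cy", "age": 45},
    {"_key": "4", "name": "Dee"},
]
over_30 = filter_documents(docs, [("age", ">", 30)], sort_by="name")
print([doc["_key"] for doc in over_30])
```

Multiple filters are combined with a logical AND in this sketch, so each added filter can only narrow the result.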
Indexes
Indexes allow fast access to documents, provided the indexed attribute(s) are used in a query. Learn more about Indexes: https://macrometa.com/docs/documents/indexing/overview/
There are 4 types of Indexes that can be used with Document Collections:
- Geo Index
- Fulltext Index
- Persistent Index
- TTL (Time-to-Live) Index
The indexes section shows the active indexes for the selected collection. The elements of a collection index are explained below.
- ID: An automatically generated unique number that identifies the index. No two indexes can have the same ID.
- Type: The type of the index. Indexes are used to speed up searches, and the available types are Persistent, TTL, Fulltext, and Geo indexes.
- Unique: If the index is declared unique, no two documents are allowed to have the same set of attribute values. The value is True by default for the primary index and False for all other indexes unless specified otherwise.
- Sparse: If the index is declared sparse, a document will be excluded from the index and no uniqueness checks will be performed if any index attribute value is not set or has a value of null.
True: If a collection index has been declared sparse.
False: If a collection index has not been declared sparse.
- Deduplicate: Controls whether inserting duplicate index values from the same document into a unique array index will lead to a unique constraint error or not.
True: This is the default value. Only a single instance of each non-unique index value will be inserted into the index per document.
False: Each instance of a non-unique index value will be inserted into the index per document.
- Extras: Extra conditions of the index definition, for example, the minimum length for a Fulltext Index.
- Selectivity Est: An estimate indicating the percentage of documents affected by the indexed attribute(s).
- Fields: The attribute or attributes on which the index is created.
- Name: A custom name generated for a non-primary index. The primary index will be created and named during the creation of a collection.
- Action: Allows a user to delete or add indexes. Please note that the primary key is a unique identifier and cannot be deleted.
- Create ➕: Allows users to create a new index by specifying its type and other required details. The requirements vary based on the type of index selected.
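How the Unique, Sparse, and Deduplicate options interact can be easier to follow with a small model. The sketch below is a simplified, hypothetical `ToyIndex` class written for this article; it is not the GDN implementation, and the choice to treat a missing attribute like null in a non-sparse index is an assumption.

```python
class UniqueConstraintError(Exception):
    pass


class ToyIndex:
    """Illustrative single-field index; not the GDN implementation."""

    def __init__(self, field, unique=False, sparse=False, deduplicate=True):
        self.field = field
        self.unique = unique
        self.sparse = sparse
        self.deduplicate = deduplicate
        self.entries = {}  # indexed value -> list of document keys

    def insert(self, doc):
        value = doc.get(self.field)
        if value is None and self.sparse:
            return  # sparse: skip the document, no uniqueness check
        # Array attributes contribute one entry per element.
        values = list(value) if isinstance(value, list) else [value]
        if self.deduplicate:
            values = list(dict.fromkeys(values))  # drop repeats, keep order
        if self.unique:
            seen = set()
            for v in values:
                if v in self.entries or v in seen:
                    raise UniqueConstraintError(f"{self.field}={v!r}")
                seen.add(v)
        for v in values:
            self.entries.setdefault(v, []).append(doc["_key"])


email_index = ToyIndex("email", unique=True, sparse=True)
email_index.insert({"_key": "1", "email": "a@example.com"})
email_index.insert({"_key": "2"})                 # skipped: attribute not set
email_index.insert({"_key": "3", "email": None})  # skipped: null value

tag_index = ToyIndex("tags", unique=True, deduplicate=False)
try:
    tag_index.insert({"_key": "4", "tags": ["x", "x"]})
except UniqueConstraintError as err:
    print("rejected:", err)
```

With `deduplicate=True` (the default), the same document with tags `["x", "x"]` would be accepted and indexed once under "x".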
Let us take a closer look at different types of indexes and the fields and requirements to be filled while creating them:
Collection Index creation example.
Geo Index:
- Fields: Choose between one or two attribute paths, latitude and/or longitude, from the collection.
- Name(optional): The name of the index, if left blank, will be auto-generated.
- Geo JSON: Set to true if attributes are stored in arrays []. Otherwise, set it to false.
- Create in the background: If true, will create the index in the background.
Persistent Index:
- Type: Persistent Index (the index entries are written to disk when documents are stored or updated).
- Fields: Choose one or more attributes from the collection.
- Name(optional): The name of the index, if left blank, will be auto-generated.
- Unique: If true, will create a unique index.
- Sparse: If true, will create the sparse index.
- Deduplicate array values: If selected, duplicate index values from the same document are inserted into a unique array index only once, so they do not cause a unique constraint error.
- Create in the background: If true, will create the index in the background.
Fulltext Index:
- Type: Fulltext Index (used to find words or prefixes of words inside documents)
- Fields: Single attribute path.
- Name(optional): The name of the index, if left blank, will be auto-generated.
- Min. length(optional): The minimum length, in characters, of the words to include in the index.
- Create in the background: If true, will create an index in the background rather than lock the collection while the index is created. This allows for basic CRUD operations to occur while the index is created.
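The effect of the minimum length setting on word and prefix lookups can be sketched as follows. This is a simplified model under stated assumptions: the tokenisation rule (lowercased runs of word characters) and the function names `build_fulltext_index` and `search` are hypothetical and are not taken from the GDN.

```python
import re


def build_fulltext_index(docs, field, min_length=3):
    """Map each lowercase word of at least min_length characters to document keys."""
    index = {}
    for doc in docs:
        text = str(doc.get(field) or "")
        for word in re.findall(r"\w+", text.lower()):
            if len(word) >= min_length:
                index.setdefault(word, set()).add(doc["_key"])
    return index


def search(index, term, prefix=False):
    """Return keys of documents containing the word, or any word with the prefix."""
    term = term.lower()
    if not prefix:
        return set(index.get(term, set()))
    keys = set()
    for word, doc_keys in index.items():
        if word.startswith(term):
            keys |= doc_keys
    return keys


docs = [
    {"_key": "1", "text": "Global data network"},
    {"_key": "2", "text": "A network of edge databases"},
]
index = build_fulltext_index(docs, "text", min_length=3)
print(sorted(search(index, "network")))
print(sorted(search(index, "data", prefix=True)))
print(sorted(search(index, "of")))
```

In this sketch, words shorter than the minimum length (such as "of") are never indexed, so searching for them returns nothing.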
TTL (time to live) Index:
- Type: TTL (time-to-live) Index (automatically removing expired documents from a collection).
- Fields: Single attribute containing a numeric DateTime value.
- Name(optional): The name of the index, if left blank, will be auto-generated.
- Documents expire after (s): A number of seconds to be added to the timestamp attribute value of each document.
- Create in the background: If true, will create an index in the background rather than lock the collection while the index is created. This allows for basic CRUD operations to occur while the index is created.
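The expiry rule described above (timestamp attribute plus a number of seconds) can be sketched in a few lines. The example assumes the numeric DateTime value is a Unix timestamp in seconds and that documents without a numeric timestamp never expire; both are assumptions, and the function and attribute names are hypothetical.

```python
import time


def is_expired(doc, field, expire_after, now=None):
    """True once now has reached the document's timestamp plus expire_after seconds."""
    now = time.time() if now is None else now
    timestamp = doc.get(field)
    if not isinstance(timestamp, (int, float)):
        return False  # assumption: no numeric timestamp means no expiry
    return now >= timestamp + expire_after


def remove_expired(docs, field, expire_after, now=None):
    """Return only the documents that have not yet expired."""
    return [d for d in docs if not is_expired(d, field, expire_after, now)]


sessions = [
    {"_key": "s1", "createdAt": 1_700_000_000},
    {"_key": "s2", "createdAt": 1_700_003_000},
    {"_key": "s3"},
]
# With a 3600 s TTL and the clock at 1_700_004_000, only s1 has expired.
live = remove_expired(sessions, "createdAt", expire_after=3600, now=1_700_004_000)
print([d["_key"] for d in live])
```

A real TTL index removes expired documents automatically in the background; the sketch only shows how the expiry time is derived.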
Stream (If enabled)
Each collection in the GDN can be a stream. Collection streams use the WebSocket protocol to emit event messages for operations performed on the collection.
Learn more about Streams and Stream Processing through the GDN in Macrometa’s documentation.
- Msg Rate In: Rate of data packets received per second.
- Msg Rate Out: Rate of data packets sent per second.
- Msg Throughput In/Out: Throughput or the amount of data passing through the pipeline per second.
- Average Msg Size: Size of an average data packet in KB.
- Storage Size: Total storage size in KB.
- Stream: Name of the enabled stream for collection (same as collection name).
- Replication: Local or Global. Collection streams are always local streams.
- Type: The stream type of the collection's enabled stream.
- Region: The region in which the collection was created.
- WebSocket URL: The WebSocket address of the stream’s API.
- Latest Message: Click to update the stream to the latest messages.
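To make the rate, throughput, and average size figures concrete, the sketch below derives them from a batch of messages observed over a fixed window. The exact definitions the console uses are not stated in this article, so the formulas here are assumptions, and `stream_metrics` is a hypothetical helper.

```python
import json


def stream_metrics(messages, window_seconds):
    """Summarise byte payloads observed during a window of the given length."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    count = len(messages)
    total_bytes = sum(len(m) for m in messages)
    return {
        "msg_rate": count / window_seconds,          # messages per second
        "throughput": total_bytes / window_seconds,  # bytes per second
        "avg_msg_size_kb": (total_bytes / count / 1024) if count else 0.0,
    }


# Ten small JSON events, assumed to arrive within a 5 second window.
messages = [
    json.dumps({"_key": str(i), "value": i}).encode("utf-8") for i in range(10)
]
metrics = stream_metrics(messages, window_seconds=5.0)
print(metrics["msg_rate"])
```

The same calculation applies to both directions: feed it received messages for the "In" figures and sent messages for the "Out" figures.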
Settings:
- ID: Auto-generated ID of the selected collection.
- Type: Type of the selected collection.
- Status: Status of the selected collection.
- Delete: An option to completely delete the selected collection.
- Truncate: This option is to remove all the documents from the collection.