Concurrent Data Access: Why Actian Zen Is a Better Data Store than a Flat File
Programmers often use flat files for data storage because they’re easy to use and easy to set up. But using a database like Zen is easy too, and it greatly simplifies both code and data management tasks.
For a single application or user, writing to and reading from files is usually simple. But what happens when you have multiple applications writing to a single storage location, or trying to read from and possibly update a single shared file? With standard file access, it’s up to the developer to handle issues of safe concurrent reading and writing.
Actian Zen makes it easy to meet the heavy demand for concurrent data access in multilevel distributed data environments such as large IoT deployments.
Concurrency with Flat Files
What do we mean by concurrent data access? We mean simultaneous access to a data file by more than one program. If the file is only being read and never changed, there’s no problem. But once there are multiple readers and writers, all bets are off. When several write routines run at the same time, their write operations can interleave, so the outcome varies from run to run. If you’re writing structured data to a file, overlapping writes break the structure of the data and the resulting file is corrupted.
It’s common for data from various sources, such as IoT and mobile devices, to be saved in a shared data repository. Imagine a set of sensors spread throughout an environment. The data generated by all the sensors needs to be gathered in a single location for analysis, and overlapping writes can occur while the devices are transferring their data.
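To make the corruption concrete, here is a minimal sketch that simulates two uncoordinated writers whose records reach the file in small alternating chunks. It runs entirely in memory. The function name `interleaveWrites` and the strict alternation are assumptions made for illustration; real interleaving depends on the operating system, buffer sizes, and timing.

```cpp
#include <cstddef>
#include <string>

// Simulates two writers appending to the same file with no coordination.
// Each writer emits its record in chunks of `chunkSize` bytes, and the
// chunks land in the file alternately (writer A first).
std::string interleaveWrites(const std::string& recordA,
                             const std::string& recordB,
                             std::size_t chunkSize)
{
    if (chunkSize == 0)
        chunkSize = 1;  // avoid an endless loop on a zero chunk size

    std::string file;
    std::size_t posA = 0;
    std::size_t posB = 0;
    while (posA < recordA.size() || posB < recordB.size())
    {
        if (posA < recordA.size())
        {
            file += recordA.substr(posA, chunkSize);
            posA += chunkSize;
        }
        if (posB < recordB.size())
        {
            file += recordB.substr(posB, chunkSize);
            posB += chunkSize;
        }
    }
    return file;
}
```

With the records `"AAAA;"` and `"BBBB;"` and a chunk size of 2, the simulated file contains `"AABBAABB;;"`, so neither record survives intact. When each record fits in a single chunk, the result is the expected `"AAAA;BBBB;"`.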
With flat files, there are a few ways of mitigating possible corruption. One is file locking; the exact mechanism and functions vary from one operating system to another. When a file is locked by one application, attempts by other programs to read or modify it are blocked or fail, so those programs must wait until the file is unlocked.
In the following example, a file is opened for exclusive access. While one process holds the file, attempts by other processes to open it fail, and each of them must wait until the file is available before it can acquire its own lock. This code waits a random amount of time and retries until it succeeds.
void write(vector<SensorReading> readings)
{
    HANDLE hFile;
    do
    {
        // A share mode of 0 requests exclusive access to the file.
        hFile = CreateFile(L"sensorData.log", GENERIC_WRITE, 0, nullptr,
                           OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
        // If the file couldn't be opened (something else has the exclusive lock),
        // wait for some random amount of time, but no more than 2 seconds, before
        // trying again.
        if (hFile == INVALID_HANDLE_VALUE)
            Sleep(rand() % 2000);
    } while (hFile == INVALID_HANDLE_VALUE);

    // Start writing at the end of the file, now that the handle is valid.
    SetFilePointer(hFile, 0, nullptr, FILE_END);

    for (size_t i = 0; i < readings.size(); ++i)
    {
        DWORD bytesWritten;
        WriteFile(hFile, &readings[i], sizeof(SensorReading), &bytesWritten, nullptr);
    }
    CloseHandle(hFile);
}

vector<SensorReading> read()
{
    SensorReading reading;
    vector<SensorReading> readingList;
    HANDLE hFile;
    do
    {
        hFile = CreateFile(L"sensorData.log", GENERIC_READ, 0, nullptr,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
        // If the file couldn't be opened (something else has the exclusive lock),
        // wait for some random amount of time, but no more than 2 seconds, before
        // trying again.
        if (hFile == INVALID_HANDLE_VALUE)
            Sleep(rand() % 2000);
    } while (hFile == INVALID_HANDLE_VALUE);

    // Keep only complete records; stop at end of file or on a read error.
    DWORD bytesRead = 0;
    while (ReadFile(hFile, &reading, sizeof(SensorReading), &bytesRead, nullptr)
           && bytesRead == sizeof(SensorReading))
    {
        readingList.push_back(reading);
    }
    CloseHandle(hFile);
    return readingList;
}
Once a program does gain access to the file, it will complete all of its operations before the next program gets access. So, each program successfully performs all of its writes without interference from other programs, but there’s a performance consequence: Since only one program at a time can access the file, more time is needed for all instances of the programs to complete their tasks. Plus, each application needs additional logic to handle waiting on access.
To avoid complete serialization of file access resulting from exclusive opens, a program could use a partial lock to grant exclusive access to a specific byte range. Other processes have access to the bytes outside of the locked range for reading, writing, or asserting other locks.
If multiple clients are accessing this file, there’s no problem as long as they’re not trying to access the same bytes. But if multiple clients try to access the same portion of the file, errors could occur and the program will need extra code to handle those situations.
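The overlap rule behind byte-range locks can be sketched without any operating system calls. The following is a minimal in-memory model, not a real locking API: `RangeLock`, `RangeLockTable`, `tryLock`, and `unlockAll` are hypothetical names, and an actual program would rely on the platform's own byte-range locking functions instead.

```cpp
#include <cstddef>
#include <vector>

// One locked byte range [offset, offset + length) held by `owner`.
struct RangeLock
{
    std::size_t offset;
    std::size_t length;
    int owner;
};

// In-memory stand-in for the bookkeeping behind byte-range locks.
class RangeLockTable
{
public:
    // Grants the lock only if it overlaps no range held by another owner.
    bool tryLock(std::size_t offset, std::size_t length, int owner)
    {
        for (const RangeLock& held : locks_)
        {
            bool overlaps = offset < held.offset + held.length &&
                            held.offset < offset + length;
            if (overlaps && held.owner != owner)
                return false;  // caller must retry later or report an error
        }
        locks_.push_back({offset, length, owner});
        return true;
    }

    // Releases every range held by `owner`.
    void unlockAll(int owner)
    {
        std::vector<RangeLock> kept;
        for (const RangeLock& held : locks_)
        {
            if (held.owner != owner)
                kept.push_back(held);
        }
        locks_.swap(kept);
    }

private:
    std::vector<RangeLock> locks_;
};
```

Two clients that lock bytes 0 to 99 and bytes 100 to 199 both succeed, while a request for bytes 50 to 149 from a second client is refused until the first client releases its range. Every caller has to handle that refusal, which is the extra code the paragraph above refers to.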
Imagine multiple devices storing their sensor data to a common location. With sections of a file being locked, a possible outcome is that some sensor records are successfully stored while others fail because of sections of the file being locked. Code could be added to wait on a needed section of a file to become unlocked, but if two devices both have a lock on a section of data that the other device needs, they could end up deadlocked – each stuck waiting on the other to release the file lock.
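One way to reason about the deadlock described above is a wait-for graph: if following "who is waiting on whom" ever leads back to a device already visited, no one in that loop can make progress. The sketch below assumes a simplified model in which each device waits on at most one other device; the function name `hasDeadlock` and the integer device IDs are hypothetical.

```cpp
#include <map>
#include <set>

// waitsFor[a] == b means device `a` is blocked until device `b` releases a lock.
// Each device waits on at most one other device in this simplified model.
bool hasDeadlock(const std::map<int, int>& waitsFor)
{
    for (const auto& entry : waitsFor)
    {
        std::set<int> visited;
        int current = entry.first;
        // Follow the chain of waits; revisiting a device means a cycle.
        while (waitsFor.count(current) != 0)
        {
            if (!visited.insert(current).second)
                return true;
            current = waitsFor.at(current);
        }
    }
    return false;
}
```

If device 1 waits on device 2 while device 2 waits on device 1, the function returns true. A common way to prevent such cycles in hand-written locking code is to have every program acquire its locks in the same fixed order, for example by ascending file offset.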
As more clients are added, the likelihood of such events rises. In addition, if files are copied from one environment to another, the files must all use the same format and structure or you’ll need a tool to translate between them. Ensuring that the same file formats are used across products requires coordination among the developers. Differences in formats mean a need for extract, transform, load (ETL) processes and additional overhead for data transfer.
Concurrent Data Access with Zen
Actian Zen orchestrates concurrent data access across different devices running different operating systems and architectures. It runs on PCs running Windows, macOS, and Linux, and on IoT endpoints including the Raspberry Pi, Windows IoT Core, and iOS and Android mobile devices. On all these devices, the data is saved using the same file format. If data or even entire files need to be copied from one Zen instance to another, no ETL processing is needed.
Two devices can open the same file without special code. Even if the file is on a different machine, this can be done safely. Several devices could use the following code to write their sensor data to the same file without interfering with one another.
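void WriteSensorData(vector<SensorReading> readings)
{
    BtrieveClient btrieveClient;
    BtrieveFile btrieveFile;
    // Open the shared file; the Zen engine coordinates access with other clients.
    Btrieve::StatusCode status = btrieveClient.FileOpen(&btrieveFile, FILE_NAME, NULL,
                                                        Btrieve::OPEN_MODE_NORMAL);
    if (status != Btrieve::STATUS_CODE_NO_ERROR)
        return;
    for (size_t i = 0; i < readings.size(); ++i)
    {
        // Each call inserts one record.
        btrieveFile.RecordCreate((char*)&readings[i], sizeof(SensorReading));
    }
    btrieveClient.FileClose(&btrieveFile);
    btrieveClient.Reset();
}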
The Zen engine coordinates access by all programs that have a handle to a given file, making sure that write operations are consistent and safely completed, with no extra code needed in the application.
When programs read records from a file, they never see data in an intermediate, partially written state. The Zen engine guarantees this automatically.
An application that needs to perform updates can choose to retrieve records with a lock bias by simply passing a lock mode, such as Btrieve::LOCK_MODE_SINGLE_WAIT, to the RecordRetrieve calls. The application can lock a single record at a time, or multiple records. It can also specify whether the read-and-lock request should wait until the record is available to be locked, or return a status code (“record in use”) if the desired record is already locked by another user.
All of the lock coordination is handled by the Zen engine, and the application merely needs to indicate whether it wants to retrieve with or without locks. The following code sample shows an implementation of this feature:
vector<SensorReading> ReadSensorData()
{
    BtrieveClient btrieveClient;
    BtrieveFile btrieveFile;
    Btrieve::StatusCode status;
    vector<SensorReading> readingList;
    SensorReading sensorReading;
    btrieveClient.FileOpen(&btrieveFile, SENSOR_FILE_NAME, NULL, Btrieve::OPEN_MODE_NORMAL);
    int bytesRead = btrieveFile.RecordRetrieveFirst(
        Btrieve::INDEX_NONE, (char*)&sensorReading, sizeof(SensorReading),
        Btrieve::LOCK_MODE_SINGLE_WAIT);
    status = btrieveFile.GetLastStatusCode();
    while (status == Btrieve::STATUS_CODE_NO_ERROR)
    {
        readingList.push_back(sensorReading);
        btrieveFile.RecordRetrieveNext((char*)&sensorReading, sizeof(SensorReading),
                                       Btrieve::LOCK_MODE_SINGLE_WAIT);
        status = btrieveFile.GetLastStatusCode();
    }
    btrieveClient.FileClose(&btrieveFile);
    btrieveClient.Reset();
    return readingList;
}
Zen also offers solutions to keep data in sync across devices. If multiple clients are performing insertions, there’s nothing special that needs to be done in code. Zen will automatically take care that the data is properly inserted.
It is sometimes necessary to perform a number of insert, update, and delete operations as a unit, so that either all of them take effect or none do. Wrapping these operations in a transaction causes the Zen engine to guarantee exactly that. The Zen engine handles multiple transactions that happen concurrently as long as they don’t affect the same records.
A transaction will either completely succeed or fail as a single unit. If it fails, the operations make no change to the data files; there will be no partial results or inconsistent data.
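The all-or-nothing behavior can be illustrated with a small in-memory sketch. This is not how the Zen engine implements transactions; it only shows the observable contract by staging changes on a private copy and publishing them in one step. The names `Table`, `Operation`, and `runAtomically` are hypothetical.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

using Table = std::map<int, std::string>;       // record key -> record data
using Operation = std::function<bool(Table&)>;  // returns false on failure

// Applies every operation to a private copy and publishes the copy only if
// all of them succeed, so the caller never observes a partial result.
bool runAtomically(Table& table, const std::vector<Operation>& operations)
{
    Table working = table;  // stage changes away from the live data
    for (const Operation& op : operations)
    {
        if (!op(working))
            return false;   // abort: `table` is left untouched
    }
    table.swap(working);    // commit: all changes appear together
    return true;
}
```

If the first operation inserts a record and the second one fails, the table still holds exactly what it held before the call. Copying the whole table is only practical for a toy example; it stands in for whatever mechanism a real engine uses to undo partial work.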
The following transaction affects two records. It writes a new maintenance event record to a log, then flags a sensor reading to indicate that its measurement may have been influenced by that maintenance event.
btrieveClient.TransactionBegin(
    Btrieve::TransactionMode::TRANSACTION_MODE_CONCURRENT_WRITE_WAIT);
try
{
    // Write the maintenance event to the log.
    status = eventFile.RecordCreate((char*)&eventRecord, sizeof(Event));
    if (status != Btrieve::STATUS_CODE_NO_ERROR)
        throw "Record could not be created";

    // Look up the sensor reading by its key.
    int bytesRead = sensorFile.RecordRetrieve(
        Btrieve::COMPARISON_EQUAL, Btrieve::INDEX_1,
        (char*)sensorRecordKey, sizeof(sensorRecordKey),
        (char*)&sensorReading, sizeof(SensorReading));
    if (bytesRead == -1)
        throw "Record could not be retrieved";

    // Flag the reading and write it back.
    sensorReading.maintenance = true;
    status = sensorFile.RecordUpdate((char*)&sensorReading, sizeof(SensorReading));
    if (status != Btrieve::STATUS_CODE_NO_ERROR)
        throw "Record was not updated";

    // Commit both changes as a single unit.
    status = btrieveClient.TransactionEnd();
    if (status != Btrieve::STATUS_CODE_NO_ERROR)
        throw "Transaction could not be committed";
    return true;
}
catch (const char* msg)
{
    // Roll back so that neither change is applied.
    btrieveClient.TransactionAbort();
    cerr << msg;
    return false;
}
Locking Data Access in Zen
You can apply a range of different types of locks to a Zen data file.
With an exclusive lock, an entire data file can be locked and available to only a single program or user. An application can also lock a single record or a set of records. If an application or process already has a lock on a record and another application requests a lock on the same record, the later request can be set to wait for the requested record to become available.
Alternatively, the program can choose not to wait at all: it either acquires the lock immediately, if the records are available, or fails immediately. The waiting behavior is up to the application. Most applications use no-wait locks and implement their own retry logic.
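Application-level retry logic around a no-wait lock might look like the following sketch. It is deliberately independent of any database API: `acquireWithRetry` is a hypothetical helper, and the `tryLock` callable stands for whatever no-wait request the application issues, returning false when the record is in use.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Calls `tryLock` until it succeeds or `maxAttempts` is reached, doubling the
// pause between attempts. Returns true if the lock was acquired.
bool acquireWithRetry(const std::function<bool()>& tryLock,
                      int maxAttempts,
                      std::chrono::milliseconds initialDelay)
{
    std::chrono::milliseconds delay = initialDelay;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt)
    {
        if (tryLock())
            return true;
        if (attempt < maxAttempts)
        {
            std::this_thread::sleep_for(delay);  // back off before retrying
            delay *= 2;
        }
    }
    return false;  // give up and let the caller decide what to do
}
```

Bounding the number of attempts keeps a busy record from stalling the application indefinitely, and the growing delay reduces contention when many clients retry at once. The right limits depend on the workload, so the values passed in are left to the caller.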
Furthermore, for many operations the locking doesn’t need to be explicit, but the Zen engine still protects data integrity. If two processes are updating the same record, the second process receives an error status for the operation, letting it know that the data in the record changed between the time it was read and the time the write was attempted.
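The conflict check described above resembles optimistic concurrency control. The sketch below models the idea with a version counter. It is an illustration only and makes no claim about how the Zen engine detects the change internally; `VersionedRecord`, `UpdateStatus`, and `updateIfUnchanged` are hypothetical names.

```cpp
#include <string>

// A record plus a version counter that increases on every successful write.
struct VersionedRecord
{
    std::string data;
    unsigned version = 0;
};

enum class UpdateStatus { Ok, Conflict };

// Writes `newData` only if the record still has the version the caller read.
UpdateStatus updateIfUnchanged(VersionedRecord& stored,
                               unsigned versionRead,
                               const std::string& newData)
{
    if (stored.version != versionRead)
        return UpdateStatus::Conflict;  // someone else wrote first
    stored.data = newData;
    ++stored.version;
    return UpdateStatus::Ok;
}
```

If two processes read the record at the same version, the first update succeeds and the second receives `Conflict`. The second process can then re-read the record and decide whether its change still applies.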
Wrapping Up
For most common scenarios, a Zen instance can support hundreds of concurrent users, whether the users are applications running on the same device or on peer devices connected through the network.
Coordination of concurrent data access in a multiplatform environment is a nontrivial problem. The Zen family of database products is a solution that works across a range of device types, operating systems, and development environments. Because it complies with ACID principles, use of Zen protects your data from corruption that could otherwise occur from connectivity failures or other issues that might result from concurrent data access.
You can try out many of the Zen products through the Electronic Software Distribution page. For more information on Zen solutions for edge and embedded devices, see the Actian Zen home page.