
Experimental Data Redundancy



Last update: 15-11-2023

Experimental data redundancy during an experiment means that every dataset uploaded to a data repository, for instance a Dataverse, also contains the values of previously uploaded datasets, re-uploaded together with the newer data. On the repository side, as an experiment advances, the number of available dataset files increases, with overlapping measured sensor values between consecutive files.

The most recent dataset upload therefore holds all data values; each value carries a unique fingerprint ID, and the values are linked together in a blockchain-like fashion.
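As a minimal sketch of this idea, assuming a SHA-256 fingerprint and illustrative names (this is not the project's actual firmware code), each upload could repeat all previously measured values and carry a fingerprint chained to the previous upload:

```python
import hashlib
import json

def fingerprint(payload: dict, previous_fingerprint: str) -> str:
    """Hash the dataset content together with the previous upload's fingerprint."""
    raw = json.dumps(payload, sort_keys=True) + previous_fingerprint
    return hashlib.sha256(raw.encode()).hexdigest()

measurements = []          # grows as the experiment advances
previous_fp = "GENESIS"    # the first upload has no predecessor
uploads = []

for new_value in [21.4, 21.7, 22.1]:          # e.g. scheduled sensor readings
    measurements.append(new_value)
    payload = {"values": list(measurements)}  # older values are re-uploaded too
    fp = fingerprint(payload, previous_fp)
    uploads.append({"payload": payload, "fingerprint": fp, "previous": previous_fp})
    previous_fp = fp                          # the next upload links to this one
```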

It is also important to stress the role of time-linear dataset uploads as a way to help validate the origin of experimental data. For this reason, two things should happen:

  • truly randomized dataset uploads to a data repository, initiated by the smart DAQ setup in any experiment and throughout the experimental campaign
  • truly randomized dataset upload requests made remotely by the data repository.

These uploads should take place around the clock: during working hours as well as at night, when the laboratory is dark.
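A possible way to obtain such unpredictable upload times on the DAQ side is sketched below; the `upload_dataset` placeholder and the 10 to 60 minute window are assumptions, not values taken from the firmware:

```python
import random
import time

rng = random.SystemRandom()    # OS entropy, so intervals cannot be anticipated

def upload_dataset():
    print("dataset uploaded")  # placeholder for the real repository upload

for _ in range(3):             # in firmware this would run for the whole campaign
    # wait an unpredictable interval (here between 10 and 60 minutes)
    time.sleep(rng.uniform(10 * 60, 60 * 60))
    upload_dataset()
```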


Smart DAQ Autonomous Cooperation

(Figure: datasets record history)

One possible way to improve the trustworthiness of experimental data collected from local sensors is to use redundant datasets. Each smart DAQ connected in a swarm-like manner holds, in real time, a copy of the datasets of every other device it connects to; these copies are updated and verified on each new measurement and replicated across the swarm. In case of external interference, it then becomes possible to determine whether the interference was continuous or intermittent during the experimental campaign, by identifying mismatched values between the current dataset and older datasets previously received from the other devices. Each smart DAQ has the task of comparing the datasets it already holds for another device with the newer ones received from that same device, while also verifying changes in the experimental data blockchain hashes. A sketch of such a comparison is shown below.
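A minimal sketch of that comparison, assuming each smart DAQ keeps a simple timestamp-to-value mapping for every peer (function and field names are illustrative, not taken from the firmware):

```python
def find_mismatches(old_copy: dict, new_copy: dict) -> list:
    """Return timestamps whose already-recorded values changed in the newer copy."""
    mismatches = []
    for timestamp, value in old_copy.items():
        if timestamp in new_copy and new_copy[timestamp] != value:
            mismatches.append(timestamp)
    return mismatches

# copy of device B's dataset held locally on device A
stored = {"2023-11-15T10:00": 21.4, "2023-11-15T10:10": 21.7}
# newer copy just received from device B (one older value was altered)
received = {"2023-11-15T10:00": 21.4, "2023-11-15T10:10": 25.0, "2023-11-15T10:20": 22.1}

print(find_mismatches(stored, received))   # -> ['2023-11-15T10:10']
```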

Another way to improve the trustworthiness of the collected experimental data is to upload to the data repository all copies of a dataset, including the one holding the data collected from every connected smart DAQ device. In this case, the data repository must verify incoming datasets by comparing the previously stored ones with the newly arrived file, check for changes in the experimental data blockchain hashes across all dataset files, and notify the smart DAQ whenever a mismatch is found.
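A repository-side check could look roughly like the sketch below, under the assumption that every dataset file declares the fingerprint of its predecessor; `notify_daq` is a hypothetical callback, not an existing repository API:

```python
import hashlib

def file_fingerprint(content: bytes, previous_fp: str) -> str:
    return hashlib.sha256(content + previous_fp.encode()).hexdigest()

def notify_daq(message: str):
    print("notify DAQ:", message)   # placeholder for the real notification channel

def verify_incoming(stored_chain: list, incoming: bytes, claimed_previous: str) -> bool:
    """Accept the new file only if it links to the last stored fingerprint."""
    last_fp = stored_chain[-1]["fingerprint"] if stored_chain else "GENESIS"
    if claimed_previous != last_fp:
        notify_daq("fingerprint mismatch: dataset does not link to the chain")
        return False
    stored_chain.append({"content": incoming,
                         "fingerprint": file_fingerprint(incoming, last_fp)})
    return True

chain = []
verify_incoming(chain, b"t,temp\n0,21.4\n", "GENESIS")        # accepted and linked
verify_incoming(chain, b"t,temp\n0,21.4\n1,21.7\n", "BOGUS")  # rejected, DAQ notified
```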

Finally, it is also possible to apply an MD5 checksum, or another file authenticity verification algorithm, to each dataset file saved over time during an ongoing experiment. This links each dataset file to the next, much like a blockchain, so any attempt to change a dataset is easily identified when performing experimental data validation tasks.
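A short sketch of such MD5 chaining is shown below: each file's checksum is computed over its bytes plus the checksum of the previous file, so altering any older file breaks every later link. The file names are hypothetical.

```python
import hashlib
from pathlib import Path

def chained_md5(files: list) -> list:
    """Build a blockchain-like list of MD5 checksums over the dataset files."""
    chain = []
    previous = ""                      # the first file has no predecessor
    for name in files:
        data = Path(name).read_bytes()
        digest = hashlib.md5(data + previous.encode()).hexdigest()
        chain.append({"file": str(name), "md5": digest, "previous": previous})
        previous = digest
    return chain

# e.g. chained_md5(["dataset_001.csv", "dataset_002.csv", "dataset_003.csv"])
```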


DAQ Cooperation in the same batch and the same experiment

It is also possible to set up many smart DAQs to collaborate autonomously. For instance, in an experimental setup where the same batch of specimens is tested simultaneously, each specimen with its own smart DAQ collecting sensor data, the DAQs can be configured to exchange experimental data among themselves and to:

  • synchronize the collected data with one another, for instance by using the same primary key as a common index across the different databases holding experimental data
  • link individual measurements on each specimen together, by sharing individual data fingerprint IDs with one another, stored redundantly and under the same index key across the different local databases.

This blockchain-like logic includes not only the fingerprint ID of the previous data measurement but also all fingerprint IDs generated for the other specimens in the same experimental setup. For example, say a researcher has set up 3 specimens for a round of testing. Each scheduled sensor measurement, on each of the 3 specimens, will include the sensor data itself with its unique fingerprint ID, plus all fingerprint IDs from the other 2 specimens for the current time-indexed measurement and for the previous one. In total, for this particular example with only 3 specimens, any sensor data point, at any given time, holds 6 unique fingerprint IDs in every individual "experimental data block". For a setup with 5 specimens, each sensor data block holds 10 unique fingerprint IDs.
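A minimal sketch of one such "experimental data block" for the 3-specimen example is given below: it stores the measured value, the specimen's own fingerprint ID, and the fingerprint IDs of all specimens for the current and previous time index, 6 IDs in total. The structure and names are illustrative assumptions.

```python
import hashlib

def fingerprint(specimen: str, t_index: int, value: float) -> str:
    return hashlib.sha256(f"{specimen}|{t_index}|{value}".encode()).hexdigest()

def build_block(specimen, t_index, value, current_fps, previous_fps):
    return {
        "specimen": specimen,
        "time_index": t_index,
        "value": value,
        "fingerprint": current_fps[specimen],
        "current_round": dict(current_fps),    # all 3 IDs of the current round
        "previous_round": dict(previous_fps),  # all 3 IDs of the previous round
    }

previous_fps = {s: fingerprint(s, 0, 20.0) for s in ("S1", "S2", "S3")}
current_fps  = {s: fingerprint(s, 1, 21.0) for s in ("S1", "S2", "S3")}
block = build_block("S1", 1, 21.0, current_fps, previous_fps)
print(len(block["current_round"]) + len(block["previous_round"]))   # -> 6
```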

Recording experimental data this way has advantages over current experimental data acquisition setups. Experimental data can be uploaded live, in real time, to a remote data repository, allowing remote research teams to perform experimental data validation with latency limited only by technology constraints and network bandwidth at any given moment (time here is to be perceived in milliseconds). Any data forgery therefore requires direct access at the very moment an experimental dataset is being uploaded, together with the ability to understand the data and change its values away from what was factually measured in the experimental setup.



DAQ Cooperation in the same batch and different experiments

It is also possible to link experimental data blocks collected from the same batch of specimens across different experimental setups:

  • setups that start on the same date and time
  • setups that start at different dates and times but share the same measurement interval

In such cases, each individual "experimental data block" records a time-dependent index key linking individual sensor measurements, and also includes all fingerprint IDs for the corresponding index key from every specimen across the different experimental setups. To make this concrete, consider 2 different experiments, each with the same batch of 5 specimens. In this setup, each sensor measurement is stored in its own "data block" with its own unique fingerprint ID and also includes all unique fingerprint IDs from all specimens across the 2 experiments, a total of 10 fingerprint IDs.
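A sketch of such a block for the 2-experiment, 5-specimens-each case is given below: the block is keyed by a shared time-dependent index and carries the fingerprint IDs of all 10 specimens for that index. The names and key format are illustrative assumptions.

```python
import hashlib

def fingerprint(experiment: str, specimen: str, index_key: str) -> str:
    return hashlib.sha256(f"{experiment}|{specimen}|{index_key}".encode()).hexdigest()

index_key = "2023-11-15T10:00"        # shared, time-dependent index key
specimens = [(exp, f"S{i}") for exp in ("EXP-A", "EXP-B") for i in range(1, 6)]

block = {
    "index_key": index_key,
    "value": 21.4,                    # this specimen's own sensor reading
    "linked_fingerprints": {f"{e}/{s}": fingerprint(e, s, index_key)
                            for e, s in specimens},
}
print(len(block["linked_fingerprints"]))   # -> 10
```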

This kind of data validation logic, based on data redundancy and time-aware experimental measurements, allows a researcher to present dataset files in which sensor data is linked to any and all other experiments previously set up and configured to take linked measurements. This increases the trustworthiness of data stored in a public data repository while at the same time allowing remote researchers to program and automate experimental data linkage, for instance in an Excel workbook or in a Python script.


DAQ Cooperation with different batches in the same project

It is also possible to link experimental data blocks collected from different batches of specimens across different experimental setups, in a manner very similar, if not identical, to what was described previously.


DAQ Cooperation with different research projects

Finally, this smart DAQ allows individual sensor measurements to be linked across different research projects, whether they take place in the same physical laboratory or at different laboratory locations. The same principles stated previously apply: individual smart DAQs are set up and configured to exchange data autonomously, in order to generate, store, manage, and exchange unique data fingerprint IDs from the different experimental data sources.


Verification of the blockchain integrity

Verification of the blockchain integrity is done in parallel during an experimental campaign, using separate computing resources. It can be carried out on the laptop or computer of the scientific researcher and of every member of a research team, as well as by the editorial staff of a journal when a communication relating to the datasets held in a data repository is submitted.
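As a minimal sketch, assuming downloaded dataset files use the payload/fingerprint/previous layout from the earlier upload example (field names are illustrative), such an offline check could walk the chain like this:

```python
import hashlib
import json
from pathlib import Path

def verify_chain(files: list) -> bool:
    """Recompute each file's fingerprint and confirm it links to its predecessor."""
    previous = "GENESIS"
    for name in files:
        record = json.loads(Path(name).read_text())
        expected = hashlib.sha256(
            (json.dumps(record["payload"], sort_keys=True) + previous).encode()
        ).hexdigest()
        if record["fingerprint"] != expected or record["previous"] != previous:
            print("chain broken at", name)
            return False
        previous = record["fingerprint"]
    return True

# e.g. verify_chain(sorted(Path("downloads").glob("dataset_*.json")))
```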


