Upload Process

File upload in LabTrace is a multi-step operation that ensures file integrity, immutability, and verifiability through blockchain technology and distributed storage.

Process Overview

Step-by-Step Technical Process

1. File Validation & Processing

  1. File Reception: The system receives the uploaded file through the web interface or API
  2. Security Checks: The file is scanned for malware and validated against allowed file types
  3. Size Validation: The file size is checked against project and user limits
  4. Content Analysis: Basic file metadata is extracted (size, type, timestamps)
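
The type and size checks above can be sketched as a single function. This is only an illustration: the `validateUpload` name, the `DEFAULT_POLICY` object, the allowed types, and the 50 MiB limit are all assumptions, not LabTrace's actual configuration, and malware scanning is left out because it would normally be delegated to a separate scanner.

```javascript
// Hypothetical limits and allow-list; real values would come from
// project and user settings.
const DEFAULT_POLICY = {
  allowedTypes: new Set(['application/pdf', 'text/csv', 'image/png']),
  maxBytes: 50 * 1024 * 1024, // 50 MiB, an arbitrary example value
};

// Returns { ok, errors, metadata } for an upload descriptor of the
// form { name, size, mimeType }.
function validateUpload(file, policy = DEFAULT_POLICY) {
  const errors = [];
  if (!policy.allowedTypes.has(file.mimeType)) {
    errors.push(`file type not allowed: ${file.mimeType}`);
  }
  if (file.size > policy.maxBytes) {
    errors.push(`file too large: ${file.size} bytes (limit ${policy.maxBytes})`);
  }
  // Basic metadata captured during content analysis.
  const metadata = {
    name: file.name,
    size: file.size,
    type: file.mimeType,
    receivedAt: new Date().toISOString(),
  };
  return { ok: errors.length === 0, errors, metadata };
}
```

Collecting every failure in `errors`, rather than stopping at the first one, lets the caller report all problems with an upload in a single response.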

2. CID Hash Generation

The Content Identifier (CID) is generated using IPFS’s content-addressing system:
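// Sketch of CID generation. Assumes `ipfs` is an already-connected client
// exposing the js-ipfs `add` API and `filePath` points at the uploaded file.
const { readFile } = require('node:fs/promises');
async function computeCid(ipfs, filePath) {
  const fileBuffer = await readFile(filePath);
  const result = await ipfs.add(fileBuffer, {
    onlyHash: true,      // Compute the CID without storing the content
    cidVersion: 1,       // Use the CIDv1 format
    hashAlg: 'sha2-256', // SHA-256 hashing algorithm
  });
  return result.cid.toString();
}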
Key Properties of CID Hash:
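  • Deterministic: The same content, hashed with the same settings, always produces the same CID
  • Unique: Different content produces different CIDs (barring a SHA-256 collision, which is computationally infeasible)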
  • Verifiable: Anyone can verify file integrity using the CID
  • Immutable: Content cannot be changed without changing the CID
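
To make these properties concrete, the sketch below builds a CIDv1 by hand for a single raw block, using only Node's built-in `crypto` module. The helper names (`base32Encode`, `rawCidV1`, `verifyContent`) are illustrative. It assumes the content fits in one block and is encoded with the raw codec; larger files are chunked into a DAG, so their root CID will differ from this simple digest-based value.

```javascript
const { createHash } = require('node:crypto');

const BASE32_ALPHABET = 'abcdefghijklmnopqrstuvwxyz234567';

// RFC 4648 base32, lowercase and unpadded (the multibase "b" encoding).
function base32Encode(bytes) {
  let buffer = 0; // holds bits not yet emitted
  let bits = 0;   // number of valid bits in buffer
  let out = '';
  for (const byte of bytes) {
    buffer = (buffer << 8) | byte;
    bits += 8;
    while (bits >= 5) {
      out += BASE32_ALPHABET[(buffer >>> (bits - 5)) & 31];
      bits -= 5;
    }
    buffer &= (1 << bits) - 1; // keep only the unread low bits
  }
  // Pad the final partial group with zero bits on the right.
  if (bits > 0) out += BASE32_ALPHABET[(buffer << (5 - bits)) & 31];
  return out;
}

// CIDv1 for one raw block: version 0x01, codec 0x55 (raw), then the
// multihash: 0x12 (sha2-256), 0x20 (32-byte digest length), digest.
function rawCidV1(content) {
  const digest = createHash('sha256').update(content).digest();
  const cidBytes = Buffer.concat([Buffer.from([0x01, 0x55, 0x12, 0x20]), digest]);
  return 'b' + base32Encode(cidBytes);
}

// Verification is recomputation: the content matches if the CIDs are equal.
function verifyContent(content, expectedCid) {
  return rawCidV1(content) === expectedCid;
}

const cid = rawCidV1(Buffer.from('hello world'));
console.log(cid.startsWith('bafkrei'));                         // true
console.log(verifyContent(Buffer.from('hello world'), cid));    // true
console.log(verifyContent(Buffer.from('hello world!'), cid));   // false
```

Changing a single byte of the content changes the digest and therefore the CID, which is what makes the identifier usable as an integrity check.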

3. JSON File Creation

A JSON file object is created, with content that depends on the file type.
For Public Files:
{
  "FileHash": "QmX1234567890abcdef...",
  "FileLinkS3": "https://bucket.s3.amazonaws.com/filename.pdf"
}
For Private Files:
{
  "FileHash": "QmX1234567890abcdef..."
}
Important: Private files do NOT include FileLinkS3 in their metadata because they are not stored in AWS S3; only their metadata is stored on IPFS.
Key Differences:
  • Public Files: Content stored in AWS S3, metadata stored on IPFS
  • Private Files: Content not stored, only metadata stored on IPFS
  • Both Types: JSON file is stored on IPFS and operations recorded on blockchain
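
The two JSON shapes can be produced by one small builder. The `FileHash` and `FileLinkS3` keys come from the examples above; the function name `buildFileMetadata` and its parameters are assumptions made for illustration.

```javascript
// Build the JSON metadata object for an uploaded file.
// `s3Url` is only used for public files.
function buildFileMetadata({ cid, isPublic, s3Url }) {
  if (!cid) throw new Error('cid is required');
  if (isPublic) {
    if (!s3Url) throw new Error('public files need an S3 link');
    return { FileHash: cid, FileLinkS3: s3Url };
  }
  // Private files never carry an S3 link, even if one is passed in.
  return { FileHash: cid };
}

const publicMeta = buildFileMetadata({
  cid: 'QmX1234567890abcdef...', // placeholder taken from the example above
  isPublic: true,
  s3Url: 'https://bucket.s3.amazonaws.com/filename.pdf',
});
console.log(JSON.stringify(publicMeta, null, 2));
```

Omitting the key entirely for private files, rather than setting it to `null`, keeps the serialized JSON identical to the private example above.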

4. Storage Process

Public Files

  • File Content: Stored in AWS S3 (publicly accessible)
  • JSON file: Stored on IPFS
  • Contains: FileHash + FileLinkS3

Private Files

  • File Content: Not stored (only its hash is kept)
  • JSON file: Stored on IPFS
  • Contains: FileHash

Pinning Service

JSON files are pinned to ensure persistence in the IPFS network

Replication

IPFS content is replicated across multiple nodes for redundancy
Storage Benefits:
  • Content Addressing: Files are accessed by their hash, not location
  • Deduplication: Identical files share the same storage space
  • Hybrid Storage: Public files get S3 performance + IPFS verification
  • Privacy Control: Private file content is kept out of S3 and IPFS; only its hash is published
  • Persistent: Metadata remains accessible for as long as it stays pinned
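
As a rough sketch of the metadata storage step, the function below adds a JSON file to IPFS and pins it in one call. It assumes `ipfs` is a client exposing the js-ipfs `add` API (the same API used for CID generation earlier); the `storeMetadata` name is hypothetical, and how LabTrace actually talks to its pinning service is not described here.

```javascript
// Store a metadata object on IPFS and pin it so it is not garbage-collected.
// `ipfs` is assumed to be a connected client with a js-ipfs style `add`.
async function storeMetadata(ipfs, metadata) {
  // The serialized bytes determine the CID, so key order matters.
  const body = JSON.stringify(metadata);
  const { cid } = await ipfs.add(body, { cidVersion: 1, pin: true });
  return cid.toString();
}
```

Because the CID depends on the exact bytes, two metadata objects with the same fields in a different key order would be stored under different CIDs.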

5. Blockchain Transaction Recording

Blockchain Record Contains:
  • Operation Type: file_upload, file_delete, etc.
  • JSON file CID: Content identifier for the JSON file
  • Project Association: The project the file belongs to, as identified by the project's smart contract asset
  • User Information: The address of the user who performed the operation
  • Timestamp: When the operation occurred
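
The fields listed above could be assembled into a record like the one below. All property names, the `buildOperationRecord` and `encodeRecord` helpers, and the JSON encoding are assumptions for illustration; the actual on-chain layout is not specified in this section.

```javascript
// Only the two operation types named above are listed; the real set is larger.
const KNOWN_OPERATIONS = new Set(['file_upload', 'file_delete']);

// Assemble the data recorded for one file operation.
function buildOperationRecord({ operation, metadataCid, projectId, userAddress, now = new Date() }) {
  if (!KNOWN_OPERATIONS.has(operation)) {
    throw new Error(`unknown operation: ${operation}`);
  }
  return {
    operation,                    // e.g. "file_upload"
    metadataCid,                  // CID of the JSON file on IPFS
    projectId,                    // hypothetical project asset identifier
    userAddress,                  // address of the acting user
    timestamp: now.toISOString(), // when the operation occurred
  };
}

// Serialize the record to bytes, for example as a transaction payload.
function encodeRecord(record) {
  return Buffer.from(JSON.stringify(record), 'utf8');
}
```

Note that the record carries the CID of the JSON file, not the file content, so the transaction stays small regardless of file size.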

6. Database Updates

  • File metadata stored in PostgreSQL database
  • Relationship to projects and users established
  • Search indexes created for efficient queries
  • Access permissions configured
  • Transaction hash stored for verification
  • Block number and timestamp recorded
  • Smart contract interaction details saved
  • Verification status tracked
  • Complete operation history logged
  • User actions tracked with timestamps
  • System events recorded for debugging
  • Performance metrics collected

Primary and Secondary Files

Primary files are the main files uploaded to a project. Secondary files are linked to one or more primary files in the same project. For each secondary file, the user must provide a link to the primary file or a description of the procedure. The link from a secondary file to a primary file is also added to the note field of the blockchain transaction.
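
One way to enforce the "link or description" rule and produce the note text is sketched below. The `buildSecondaryNote` name, the `primaryFiles` and `procedure` keys, and the JSON note format are hypothetical; only the rule itself comes from the paragraph above.

```javascript
// Build the note for a secondary file. At least one primary-file link
// or a procedure description must be supplied.
function buildSecondaryNote({ primaryCids = [], description = '' }) {
  const text = description.trim();
  if (primaryCids.length === 0 && text === '') {
    throw new Error('a secondary file needs a primary file link or a procedure description');
  }
  const note = {};
  if (primaryCids.length > 0) note.primaryFiles = primaryCids;
  if (text !== '') note.procedure = text;
  return JSON.stringify(note);
}

console.log(buildSecondaryNote({ primaryCids: ['cid-of-primary'] }));
// {"primaryFiles":["cid-of-primary"]}
```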

Error Handling & Recovery

Upload Failures

Robust retry mechanisms and partial upload recovery

IPFS Issues

Automatic failover to backup IPFS nodes

Blockchain Delays

Graceful handling of network congestion

Storage Redundancy

Multiple storage layers ensure data safety
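
The retry and failover behaviors described above follow two common patterns, sketched here in generic form. The helper names, the attempt count, and the delays are assumptions; they are not taken from LabTrace's implementation.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry an async operation with exponential backoff
// (baseDelayMs, then 2x, 4x, ...). Rethrows the last error on exhaustion.
async function withRetry(operation, { attempts = 3, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

// Try each node in order and return the first successful result.
async function withFailover(nodes, operation) {
  const errors = [];
  for (const node of nodes) {
    try {
      return await operation(node);
    } catch (error) {
      errors.push(error);
    }
  }
  throw new AggregateError(errors, 'all nodes failed');
}
```

The two helpers compose: wrapping `withFailover` inside `withRetry` retries the whole node list after a delay, which may suit transient network congestion better than hammering a single node.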