Monday, October 13th, 2025
Magic Capture: Reducing Clinical Trial Data Entry Burden by 90%
At Harbor, we use a combination of old and new technologies to reduce clinical trial data entry burden by 90%. Most importantly, we do this without changing workflows or sacrificing compliance. The result is faster, more accurate data entry, reduced monitoring costs, and more time for sites to focus on taking care of their subjects.
While clinical trial data is increasingly digitized, the reality is that 73% of clinical sites still rely on paper source documents to capture data. This means that site staff, usually clinical research coordinators (CRCs), must spend hours manually transcribing data from paper documents into the EDC. This process is not only time-consuming, it's also error-prone and a significant driver of burnout. As one CRC told us:
"A typical ratio is for every hour with a patient, it'll take maybe 45 minutes to get it entered into an EDC with average usability. With worse systems, it takes me just as long to enter as it does to collect the data..."
The challenges with data entry aren't limited to paper source documents. Even when source documents are electronic (e.g., EHRs, dedicated eSource providers), the data still has to be manually entered into the EDC. The EHR to EDC problem is a well-known issue in the industry, but most solutions require sites to spend time building custom connectors for each study, a time-consuming and expensive process that only becomes worthwhile for trials above a certain size. Similarly, for sites to adopt eSource providers, site staff and subjects must get comfortable with electronic data collection. Even then, today's eSource software still requires manual copying of electronic source data into the EDC.
How Harbor is different
When we built our "Magic Capture" feature into Harbor EDC, we wanted to eliminate the need for manual data entry without requiring sites or sponsors to change any of their existing workflows, whether they're using paper or electronic source documents.
For sites using paper source documents, CRCs simply collect all of the documents together, scan them, and upload them to the Harbor EDC platform. Our Magic Capture feature first passes through each page of the document, identifying which subject and eCRF the data belongs to. Data values are extracted from the document and matched to the appropriate eCRF field. As a result, when CRCs go into the EDC, they can simply review the extracted data, fix any values, and fill the CRF in a matter of seconds, not minutes or hours.
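The page-by-page routing described above can be pictured with a small sketch. Everything in it is hypothetical: the `PageResult` record, its field names, and the `group_for_prefill` helper are illustrative assumptions, not Harbor's actual data model or API. It only shows how per-page results might be collected into one pre-filled draft per subject and eCRF for a CRC to review.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PageResult:
    """Hypothetical result of classifying and extracting one scanned page."""
    page: int
    subject_id: str
    ecrf: str
    values: dict[str, str]  # eCRF field name -> extracted value


def group_for_prefill(pages: list[PageResult]) -> dict[tuple[str, str], dict[str, str]]:
    """Merge page-level values into one draft per (subject, eCRF) pair."""
    drafts: dict[tuple[str, str], dict[str, str]] = defaultdict(dict)
    # Process pages in document order so later pages extend earlier ones.
    for result in sorted(pages, key=lambda r: r.page):
        drafts[(result.subject_id, result.ecrf)].update(result.values)
    return dict(drafts)


# Made-up example data, not real study values.
pages = [
    PageResult(1, "SUBJ-001", "Vitals", {"systolic_bp": "118", "diastolic_bp": "76"}),
    PageResult(2, "SUBJ-001", "Vitals", {"heart_rate": "64"}),
    PageResult(3, "SUBJ-002", "Demographics", {"sex": "F"}),
]
drafts = group_for_prefill(pages)
```

In this sketch, the two vitals pages for the same subject end up in a single draft, which mirrors the idea that a CRC reviews one pre-filled form rather than individual pages.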
Because Magic Capture uses AI to extract data, it can handle a variety of document types. In short, if your source document can be turned into a PDF, Magic Capture can handle it. This means that sites using EHRs or eSource as source documents can also use Magic Capture without any changes to their workflows. Because Harbor is fully HIPAA compliant, source documents can be uploaded to the platform without the need to redact any personal health information.[1]
Of course, we know that source documents can be messy. Our models are trained to handle a wide variety of documents, from simple forms to complicated tables and from messy handwriting to scanned printouts. To make the review process more efficient, Magic Capture generates a confidence score for each extracted data value, guiding CRCs to values that may need a second look[2] and ensuring high data accuracy through human oversight.
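As a rough illustration of how confidence scores could guide review, here is a minimal sketch. The `ExtractedValue` record, the 0-to-1 score range, and the `REVIEW_THRESHOLD` cutoff are all assumptions made for the example; they are not Harbor's actual scoring scheme or settings.

```python
from dataclasses import dataclass


@dataclass
class ExtractedValue:
    """Hypothetical extracted data point with its confidence score."""
    field: str
    value: str
    confidence: float  # assumed to lie between 0.0 and 1.0


REVIEW_THRESHOLD = 0.90  # illustrative cutoff, not a Harbor setting


def review_queue(values, threshold=REVIEW_THRESHOLD):
    """Return values scoring below the threshold, least confident first."""
    flagged = [v for v in values if v.confidence < threshold]
    return sorted(flagged, key=lambda v: v.confidence)


# Made-up example values.
extracted = [
    ExtractedValue("visit_date", "2025-10-01", 0.99),
    ExtractedValue("weight_kg", "72.5", 0.62),  # e.g. crossed-out handwriting
    ExtractedValue("heart_rate", "64", 0.88),
]
queue = review_queue(extracted)
```

Here the queue would surface `weight_kg` first and `heart_rate` second. The sketch only orders where a reviewer looks first; the CRC still reviews the extracted data before anything is saved.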
Finally, the entire Magic Capture workflow has been designed to be compliant with 21 CFR Part 11 and Good Clinical Practice. The workflow allows CRCs to quickly and efficiently review the extracted data. Once the data has been reviewed, the CRF fields are automatically populated and saved. For auditors, all actions are fully traceable and attributable to human users.
How do sites benefit?
In early testing, Harbor's Magic Capture reduced data entry time by about 90%. We recently gave the same standardized 37-page set of data-heavy source documents to four different CRCs and asked them to enter the data into Harbor EDC completely by hand. On average, our testers took two hours to enter all of the data, and even the fastest took an hour and 25 minutes.[3] When we asked the same users to enter the same data using Magic Capture, they finished in just 10-15 minutes with the same 100% data entry accuracy.
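The "about 90%" figure follows directly from the numbers above. A quick check, using the two-hour manual average as the baseline and the 10-15 minute Magic Capture range (the helper name `percent_reduction` is just for illustration):

```python
def percent_reduction(baseline_minutes: float, new_minutes: float) -> float:
    """Percentage of time saved relative to the baseline."""
    return (baseline_minutes - new_minutes) / baseline_minutes * 100


manual_avg = 120  # two hours of manual entry, in minutes

slow = percent_reduction(manual_avg, 15)  # 87.5
fast = percent_reduction(manual_avg, 10)  # roughly 91.7
```

So the observed reduction ranges from 87.5% to roughly 91.7%, which brackets the headline figure.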
By getting data entered faster, site staff are freed up to focus on much higher-value activities. This could mean spending more time recruiting and enrolling new subjects or making sure existing subjects have the support they need to stay engaged in the study, ultimately helping studies complete on time or even ahead of schedule. It could also mean making sure study materials are organized, regulatory paperwork is in order, and queries are answered quickly, keeping the data as clean as possible.
How do sponsors benefit?
Near real-time study visibility
With the traditional "double data entry" process used with other EDC systems, getting data entered into the EDC takes an average of 7-10 days after the data is collected. We've talked to customers who have experienced even longer delays, up to four weeks in some of the worst cases. At best, this reduces sponsors' visibility into how the study is progressing, making it hard to track subject enrollment and estimate monitoring burden; at worst, important safety events or protocol deviations can go unnoticed for weeks.
Using Magic Capture, sites enter data into the EDC in an average of 1-2 days. This means that sponsors' data management teams can follow the progress of a study in near real-time, allowing sponsors to focus their oversight and monitoring efforts where they matter most and make better-informed decisions more quickly.
Reducing monitoring cost
As a downstream effect, Magic Capture also significantly reduces monitoring costs for a study. Data monitoring takes up the majority of a Clinical Research Associate's (CRA's) time and nearly a third of the total cost of a clinical study, with source data verification (SDV) alone accounting for 25% of a trial's total cost. There are a few ways in which Magic Capture reduces monitoring cost.
To start, sites upload source documents into Harbor EDC in order to use Magic Capture. This means that all source documents end up digitized, even if they started on paper. As a result, data monitors can view source documents and compare the source data with EDC data directly in the Harbor EDC platform. They can perform source data verification quickly and easily, without traveling to sites, flipping through binders of paper, or logging into multiple programs.
Additionally, Magic Capture provides a bounding rectangle and a confidence score for each extracted data value, allowing data monitors to easily identify which values need attention and prioritize their review efforts. With bounding rectangles, monitors don't have to search through the source documents to find the relevant data.
We're still learning
At Harbor, we're on a mission to make clinical trials easier, faster, and smarter. Our Magic Capture feature is completely original in the EDC market and it is now live for the first time. As we put the feature through its paces in real-world clinical trial settings, we will keep an eye on key metrics like data entry speed, query rate, and data monitoring costs for our early users, sharing the results in future blog posts and case studies. If you're intrigued and want to learn more about Harbor and Magic Capture now, please get in touch. If you're ready to see how Magic Capture can help you collect better data in your studies, book a demo.
Footnotes
1. Note that redaction of personal health information from the source document may still be necessary so that data monitors can perform data verification without access to PHI.
2. Handwriting with text crossed out is a common use case that consistently gets flagged for human review.
3. Anyone who has spent time at a clinical site will also recognize that getting two hours of uninterrupted time to enter data is highly improbable.