For data scientists, data cleansing is the 80/20 rule gone wrong. Data engineering - including preparation, intake, cleansing, and refinement - can take up a majority of an analyst's time, without any guarantees about the final data quality.
Qudol tackles these issues by automating intake, cleansing, and time-series linking. Qudol Security Master links multiple records, time periods, and vendor identifiers into a single security, and assigns it a linking identifier (QUID). All data, history, and logic are linked through the QUID, allowing for powerful historical analytics.
By algorithmically adding a time dimension to every single vendor dataset, Qudol simply and efficiently reduces the quality and accuracy problem to one of timing.
I've onboarded hundreds of years of historical data for our clients. Our Security Master product algorithmically and automatically links assets across all of your data sources (vendor and proprietary), without any limits on depth of history. Our ability to continuously improve historical links with ongoing intake - meaning that we ingest, refine, master, and deliver current data as they become available - is key to delivering the most accurate security master - and one that keeps getting better.
A question I often get is: How do I know that your security master is more "accurate"?
The short answer is that Qudol's platform is built on decades of deep experience with financial services vendor data, combined with a recognition that a subset of scenarios always requires expert disambiguation. Here's how it works:
Full lineage makes it easier for the expert to research the issue for a fix.
The following are a few examples of where I think Qudol really stands out in its capabilities for improving accuracy.
Multiple vendor assets mapping to the same QUID is the expected outcome of Qudol's deduplication and matching algorithm. But what about multiple assets from the same source? Are they two different parts of the asset history, or is it an incorrect overlap that needs to be rectified?
Typically, we see something like the example below. It appears that this vendor, for whatever reason, introduced a USA version in 2017 while the CAN version (from 1994) was still active (an end date of 2999-12-31 denotes a record that is still open).
QUID* | Start | End | AssetCode | Name | Lineage |
---|---|---|---|---|---|
134231971 | 1994-04-30 | 2999-12-31 | CAN00CP0 | PENN WEST PETROLEUM LTD | CUSIP.MSCI-76044 |
134231971 | 2017-01-06 | 2999-12-31 | USA0BQH0 | PENN WEST PETROLEUM LTD | CUSIP.MSCI-76044 |
* Identifiers have been obfuscated for compliance. Names and dates are fictitious, but meaningful.
Qudol identified this issue and generated an alert for expert review. The expert determined that the CAN asset should have been terminated in 2017. A correction was applied at intake and quickly propagated across the entire system.
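To make the overlap check concrete, here is a minimal Python sketch of how such an alert could be raised. It is only an illustration, not Qudol's actual implementation: the `AssetRecord` class and the `find_overlaps` function are hypothetical names, and the sketch assumes date ranges are half-open (the end date itself is not covered).

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations

OPEN_END = date(2999, 12, 31)  # sentinel for "still active", as in the table above


@dataclass(frozen=True)
class AssetRecord:
    """Hypothetical record shape mirroring the table columns."""
    quid: int
    start: date
    end: date
    asset_code: str


def find_overlaps(records):
    """Return pairs of records that share a QUID and whose date ranges overlap."""
    by_quid = {}
    for rec in records:
        by_quid.setdefault(rec.quid, []).append(rec)

    overlaps = []
    for recs in by_quid.values():
        for a, b in combinations(sorted(recs, key=lambda r: r.start), 2):
            # Assumption: ranges are half-open [start, end), so two ranges
            # overlap when each one starts before the other ends.
            if a.start < b.end and b.start < a.end:
                overlaps.append((a, b))
    return overlaps


records = [
    AssetRecord(134231971, date(1994, 4, 30), OPEN_END, "CAN00CP0"),
    AssetRecord(134231971, date(2017, 1, 6), OPEN_END, "USA0BQH0"),
]
overlaps = find_overlaps(records)
for a, b in overlaps:
    print(f"QUID {a.quid}: {a.asset_code} overlaps {b.asset_code} starting {b.start}")
```

Once the CAN record is terminated on 2017-01-06, as the expert decided, the same check no longer flags the pair.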
Here's another common data deficiency that Qudol identifies for expert review.
QUID* | Start | End | AssetCode | Name | Lineage |
---|---|---|---|---|---|
314253335 | 2019-03-01 | 2019-04-01 | 06128801 | EPSILON ENERGY LTD | CUSIP.MSCI-82869 |
314253335 | 2020-06-29 | 2999-12-31 | 09860201 | EPSILON ENERGY LTD | CUSIP.MSCI-82869 |
* Identifiers have been obfuscated for compliance. Names and dates are fictitious, but meaningful.
There is a gap in coverage from April 2019 to June 2020. Given that these are two different vendor assets, it is impossible to identify such gaps without the QUID (which is the same for both assets). Qudol identifies this gap automatically.
Do we need to fill the gap? If so, which one do we stretch?
This is a business question that a researcher needs to answer. Qudol takes care of all of the data wrangling behind the scenes, so you can focus on just these cases that require expert intervention.
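The gap check for the EPSILON ENERGY rows above can be sketched in a few lines of Python. Again, this is an illustration under my own assumptions rather than Qudol's code: `find_gaps` is a hypothetical helper that takes the (start, end) ranges already grouped under one QUID and treats them as half-open.

```python
from datetime import date

OPEN_END = date(2999, 12, 31)  # sentinel for "still active"


def find_gaps(intervals):
    """Given (start, end) pairs for one QUID, return the uncovered spans between them."""
    if not intervals:
        return []
    ordered = sorted(intervals)
    gaps = []
    covered_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_until:
            # Nothing covers the span between the previous end and this start.
            gaps.append((covered_until, start))
        covered_until = max(covered_until, end)
    return gaps


# Both vendor assets share QUID 314253335, so their ranges are checked together.
epsilon = [
    (date(2019, 3, 1), date(2019, 4, 1)),
    (date(2020, 6, 29), OPEN_END),
]
gaps = find_gaps(epsilon)
for gap_start, gap_end in gaps:
    print(f"No coverage from {gap_start} to {gap_end}")
```

The sketch only finds the gap. Whether to fill it, and which record to stretch, remains the researcher's call.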
Start | End | Name | QUID* | Lineage |
---|---|---|---|---|
1994-07-01 | 1996-05-01 | CROWNX INC A NVTG | 113013971 | SEDOL.MSCI-61486.1997-01-01 |
1996-05-01 | 2006-11-02 | EXTENDICARE INC A SUBVTG | 113013971 | SEDOL.MSCI-61486.1997-01-01 |
2006-11-02 | 2006-11-09 | EXTENDICARE INC A | 113013971 | ISIN.MSCI-61486.2006-11-10 |
2006-11-09 | 2999-12-31 | EXTENDICARE REIT | 170729851 | CUSIP.MSCI-94251.2011-06-22 |
* Identifiers have been obfuscated for compliance. Names and dates are fictitious, but meaningful.
This is actually one of my favorite examples, and showcases what we really mean when we say that Qudol has unlimited history.
The above vendor asset (1399201) underwent multiple corporate actions, changing its name from CROWNX to several variations of EXTENDICARE. The vendor asset is tracking two different MSCI assets (61486, 94251) over time, starting in 1994. Extendicare REIT was recognized by the vendor in 2006, but not by MSCI until 2011.
Qudol's algorithm correctly tracks the vendor data against two different MSCI QUIDs over time. Even though the asset does not appear in MSCI with the proper CUSIP until 2011 (see the Lineage column), it is correctly linked to vendor data history going back to 2006. Any analysis of MSCI asset 94251 will benefit from this deeper vendor data history.
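For readers who want to picture what "tracking over time" means in practice, here is a small point-in-time lookup over the rows in the table above. It is a sketch under my own assumptions, not a description of Qudol's API: `HISTORY` and `quid_as_of` are hypothetical names, and end dates are treated as exclusive so that adjacent rows do not overlap.

```python
from bisect import bisect_right
from datetime import date

# (start, end, quid) rows copied from the table above, sorted by start date.
# Assumption: each end date is exclusive.
HISTORY = [
    (date(1994, 7, 1), date(1996, 5, 1), 113013971),
    (date(1996, 5, 1), date(2006, 11, 2), 113013971),
    (date(2006, 11, 2), date(2006, 11, 9), 113013971),
    (date(2006, 11, 9), date(2999, 12, 31), 170729851),
]
STARTS = [row[0] for row in HISTORY]


def quid_as_of(when):
    """Return the QUID in effect on `when`, or None if that date is not covered."""
    i = bisect_right(STARTS, when) - 1  # last row starting on or before `when`
    if i < 0:
        return None
    start, end, quid = HISTORY[i]
    return quid if start <= when < end else None


# A 2008 date resolves to the REIT's QUID even though MSCI only
# carried the matching CUSIP from 2011.
print(quid_as_of(date(2008, 1, 1)))
```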
By linking assets across vendors and time, Qudol not only builds a complete picture of the data, but also helps avoid selection bias for modeling and analytics.
These are just a few of the more common scenarios that I see while working with real customer and vendor data - there are plenty more! If you have any questions, please don't hesitate to reach out.
Matt Avella
Data Operations at Qudol