Teaching Machines to Spot What Matters


Financial institutions spend millions of hours annually on compliance alerts that lead nowhere, with false positive rates at some institutions surpassing 90%. Kevin Lee of Cygnus Compliance Consulting examines how AI and related technologies are poised to transform this inefficient landscape through supervised machine learning, automated narratives and more intelligent handling of unstructured data.

Each year, financial institutions and adjacent companies around the world adjudicate millions of transaction monitoring, sanctions screening and Know Your Customer (KYC) alerts, consuming tens of millions of hours of manually intensive labor. The vast majority of these alerts are waived, with a false positive rate for some financial institutions estimated as high as 95%.

Across the industry, this inefficiency strains compliance operations teams, raises the cost of meeting regulatory deadlines and introduces under-monitoring risk in the form of business attrition and adjudicator fatigue. The current solution of shifting compliance operations to lower-cost labor markets manages costs but creates new problems, because reviewers may lack contextual understanding of the underlying data that triggered the alert in the first place.

The modernization of compliance operations in the form of adoption of AI technologies like machine learning, natural language processing (NLP) and large language models (LLMs) remains in early stages. But several applications are clearly emerging to address the inefficiencies in today’s manually intensive, rules-based landscape.

Supervised machine learning

Today’s compliance technology solutions do not consistently use a model feedback mechanism to learn and improve from historical false positive data. Instead, false positive data is discarded, except during manual tuning exercises that typically take place every one to two years. This waste matters not only in principle but in practice, given how large a share of alert data false positives represent.

The current alert escalation construct is a classic candidate for supervised machine learning on a binary classification model, i.e. a model with only two outcomes — in this case, either suspicious or not suspicious. 

Supervised learning involves training a model on historical data that includes both labeled examples of suspicious activity and labeled non-suspicious activity. The model learns to distinguish between legitimate and potentially problematic activity by identifying patterns or features that correlate with each outcome. The model will become more accurate at assessing the risk of new alerts, based on the features that led to false positives in the past and user feedback on the model’s own historical decisions.

If an alert was previously deemed a false positive, that alert can be included as a labeled example for the model. By analyzing the reason the alert was flagged incorrectly (incorrect match with a sanctions list, normal customer behavior that was misinterpreted, etc.), the model learns to better distinguish between genuine risk and regular activity in future alerts. Once the model has learned from prior false positives, it can adjust the scoring for similar alerts, transactions or profiles, reducing the likelihood of flagging them again.
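As a minimal sketch of this idea, past adjudications can be used to learn how strongly each alert feature correlates with a genuinely suspicious outcome. Everything here is hypothetical: the feature names, the adjudication history and the simple odds-based scorer are illustrative stand-ins, not a production model.

```python
from collections import defaultdict

def train(alerts):
    """alerts: list of (features, label) where label 1 = suspicious, 0 = waived.

    Counts, per feature, how often it appeared in suspicious vs. waived
    alerts, with Laplace smoothing so unseen combinations stay defined.
    """
    counts = defaultdict(lambda: [1, 1])  # feature -> [suspicious, waived]
    for features, label in alerts:
        for f in features:
            counts[f][0 if label else 1] += 1
    return counts

def score(counts, features):
    """Return a crude suspicion score in (0, 1) from per-feature odds."""
    odds = 1.0
    for f in features:
        suspicious, waived = counts.get(f, [1, 1])
        odds *= suspicious / waived
    return odds / (1 + odds)

# Hypothetical adjudication history: two escalated alerts, two waived
# as false positives (fuzzy name match only, or normal customer behavior).
history = [
    ({"name_match_exact", "high_risk_country"}, 1),
    ({"name_match_partial", "low_value"}, 0),
    ({"name_match_partial", "regular_payroll"}, 0),
    ({"name_match_exact", "structuring_pattern"}, 1),
]
model = train(history)

# A new alert resembling past false positives now scores low, while one
# resembling past escalations scores high.
print(score(model, {"name_match_partial", "low_value"}))
print(score(model, {"name_match_exact", "high_risk_country"}))
```

Production systems would use a richer classifier and far more features, but the loop is the same: waived alerts become labeled negatives that push down the scores of similar future alerts.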

In the same way, as the model produces output (decisions on whether an alert is suspicious), that output can be labeled as either accurate or not accurate by an accountable human reviewer. Any inaccurate output prompts the human reviewer for a reason from a predefined list. That data is then weighted against other feedback, including output labeled as accurate, to teach the model.
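That reviewer step could be captured as new training labels along the following lines. The reason codes and record schema below are hypothetical, chosen only to show the shape of the feedback loop.

```python
# Hypothetical predefined reasons a reviewer can give when rejecting
# the model's decision; a real list would be defined by the compliance program.
REASONS = {"fuzzy_name_only", "documented_business_purpose", "data_quality_issue"}

def record_feedback(alert_id, model_decision, reviewer_agrees, reason=None):
    """Convert a reviewer's verdict into a labeled training example.

    model_decision: 1 = suspicious, 0 = not suspicious.
    """
    if reviewer_agrees:
        label = model_decision        # confirmed decisions reinforce the model
    else:
        if reason not in REASONS:
            raise ValueError("reason must come from the predefined list")
        label = 1 - model_decision    # overridden decisions flip the label
    return {"alert_id": alert_id, "label": label, "reason": reason}

# The model escalated alert A-1042; the reviewer overrides it as a fuzzy
# name match only, producing a labeled false positive for retraining.
fb = record_feedback("A-1042", model_decision=1, reviewer_agrees=False,
                     reason="fuzzy_name_only")
```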


Eliminating manual processes

Today’s compliance alert adjudication processes are manually intensive, pulling from as many as a dozen data sources and producing narratives with inconsistent grammatical structure. Generative artificial intelligence (GenAI) and LLMs will auto-generate alert narratives by reviewing the context of transaction, customer and counterparty history, generating a detailed risk analysis, and then recommending whether an alert should be cleared or escalated.

For instance, when an alert is triggered for a transaction involving a politically exposed person (PEP), a GenAI model can pull information from external news sources, customer data and previous transaction history to assess whether the activity is legitimate or if further investigation is required. It will then generate an adjudication narrative in one or two paragraphs that explains the reasoning behind the decision, which helps compliance officers agree or disagree with the output.
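The mechanics can be illustrated by the prompt-assembly step that precedes the model call. Everything below is a hypothetical sketch: the field names, wording and single-prompt design are assumptions, and in practice the assembled context would be sent to an LLM whose response is stored alongside the alert.

```python
def build_adjudication_prompt(alert):
    """Assemble alert context into one prompt for an LLM (hypothetical fields)."""
    return (
        "You are an AML analyst. Using the context below, write a one- to "
        "two-paragraph adjudication narrative and recommend CLEAR or ESCALATE.\n"
        f"Customer: {alert['customer']} (PEP: {alert['is_pep']})\n"
        f"Transaction: {alert['amount']} {alert['currency']} "
        f"to {alert['counterparty']}\n"
        f"Prior activity: {alert['history_summary']}\n"
        f"Adverse media: {alert['media_summary']}\n"
    )

prompt = build_adjudication_prompt({
    "customer": "J. Doe", "is_pep": True,
    "amount": 25000, "currency": "USD", "counterparty": "Acme Imports Ltd",
    "history_summary": "monthly payments of similar size to the same counterparty",
    "media_summary": "no relevant adverse coverage found",
})
# The prompt is then sent to an LLM; its narrative and recommendation are
# stored with the alert for a compliance officer to confirm or reject.
```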

Likewise, GenAI can generate regulatory reports like suspicious activity reports or currency transaction reports by pulling the relevant data from across systems and ensuring that necessary fields are correctly populated, reducing the risk of human error and improving regulatory compliance.

Data accountability

Over the past 15 years, the accountability of data at financial institutions has shifted to sit with compliance departments as data stewards, in large part due to the regulatory oversight of the model input of transaction monitoring and sanctions screening systems. 

This is particularly true at midsize and smaller financial institutions that do not have a dedicated, centralized data office and/or a dedicated compliance technology function. As much as any other function, compliance is the custodian of end-to-end customer, transaction and related data. The issue that has arisen is that the data and associated metadata (e.g. alert aging, data quality statistics) that form the compliance control framework are not readily accessible.

LLMs, with other evolving AI techniques, will take natural language prompts (e.g. “How many alerts are left to review this month?”) from compliance users and return real-time insight. This eliminates or significantly reduces the intermediary steps and technical staff needed to access data for compliance officers and compresses the turnaround time on acquiring vital information.
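A minimal sketch of the pattern follows, with a fixed lookup table standing in for the LLM's translation of natural language into a database query. The table schema, prompt text and generated SQL are all hypothetical.

```python
import sqlite3

# Stand-in for the LLM step: in practice a model would translate an
# arbitrary natural language prompt into SQL against a governed schema.
PROMPT_TO_SQL = {
    "How many alerts are left to review this month?":
        "SELECT COUNT(*) FROM alerts WHERE status = 'open'",
}

def answer(conn, prompt):
    """Translate a compliance user's prompt and run it against alert data."""
    sql = PROMPT_TO_SQL[prompt]  # an LLM would generate this dynamically
    return conn.execute(sql).fetchone()[0]

# Hypothetical alert table with two alerts still open.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alerts (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO alerts VALUES (?, ?)",
                 [(1, "open"), (2, "closed"), (3, "open")])

print(answer(conn, "How many alerts are left to review this month?"))  # prints 2
```

The user never sees the SQL; the point is that the intermediary steps (ticket to a technical team, handwritten query, export) collapse into a single prompt.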

Dealing with unstructured data

Current rules-based systems work only with structured data, or data that can be organized in a standardized way, such as a spreadsheet or a table. Compliance programs rely on manual processes to review unstructured data (like emails, photos, documents or news articles) to form a complete picture of risk. As a result, compliance processes like transaction monitoring, sanctions screening or adverse media monitoring are fragmented, despite the fact the processes aim to achieve a common goal. 

NLP is a branch of AI that focuses on understanding and processing human language, particularly when dealing with unstructured data like customer communication, internet searches, transaction descriptions, third-party data and open-source intelligence. NLP can drive the convergence of structured and unstructured data to form a singular, customer-centric view of compliance risk across transaction monitoring, sanctions screening and KYC.

This is done in several ways, including:

  • Sentiment analysis: NLP can be used to assess tone and behavioral patterns in documents, emails, internet search results and communications. For example, when analyzing correspondence with high-risk customers, the system could identify whether the tone or sentiment indicates an increased likelihood of money laundering, fraud or illicit activity, or if it’s simply routine business. The importance of sentiment analysis is heightened by the growing prevalence of digital banking products and, consequently, text-based customer service.
  • Automated document verification: NLP can be combined with GenAI to automatically verify customer documents like passports, driver’s licenses or utility bills to ensure their authenticity. AI models can also cross-check these documents against public databases, watchlists and sanctions lists in real time to confirm that the customer’s identity and associated risk factors are accurate. Likewise, trade-based transactions that rely heavily on documentation like letters of credit can benefit from the same extraction and verification of data points. Those data points can then be structured and used to form an overall picture of risk, with little to no human intervention.
  • Automated integration of external intelligence: By continuously ingesting data from external sources, GenAI can help generate dynamic KYC risk profiles and better interpret potential false positive data. For example, in sanctions screening, external intelligence can help determine whether a name match on a sanctions list is relevant based on additional context, such as the customer’s latest location, a capability beyond what current systems offer.
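As a toy illustration of the sentiment-analysis idea, customer messages can be triaged by language signals before a human ever reads them. A real deployment would use a trained NLP model; the risk terms, weights and threshold below are invented for the sketch.

```python
# Hypothetical risk lexicon; a production system would use a trained
# sentiment/risk classifier rather than keyword weights.
RISK_TERMS = {"urgent": 2, "cash only": 3, "no questions": 3, "offshore": 1}

def risk_score(text):
    """Sum the weights of risk terms present in a message."""
    text = text.lower()
    return sum(weight for term, weight in RISK_TERMS.items() if term in text)

def triage(messages, threshold=3):
    """Route each message: flag for human review or treat as routine."""
    return [(m, "review" if risk_score(m) >= threshold else "routine")
            for m in messages]

msgs = ["Please send the invoice as usual.",
        "Need this moved urgent, cash only, no questions asked."]
for message, decision in triage(msgs):
    print(decision)
```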

GenAI, LLMs, NLP and other AI technologies are fundamentally reshaping the landscape of AML, sanctions and KYC compliance. These technologies will enable financial institutions to automate routine processes, readily access data with natural language prompts, significantly improve efficiency and break down the silos that separate today’s compliance processes. They also support the growing responsibility of compliance departments as custodians of data lineage, ensuring that data is properly managed, traceable and compliant with regulations.

While there are challenges — such as regulatory acceptance, data privacy concerns and the need for transparency in AI decision-making — the benefits of AI in transforming compliance processes are clear and imminent. 

Financial institutions that successfully integrate AI into their compliance operations will not only be better equipped to manage regulatory risks, but will also lead the way in creating more efficient, scalable and effective systems for fighting financial crime.

The post Teaching Machines to Spot What Matters appeared first on Corporate Compliance Insights.
