Natural language processing (NLP) explained

Egress | 21st Feb 2022

People and machines speak different languages. Over the decades, many companies attempted to bridge the gap with programs and tools. Still, traditional methods always fell short of giving computers the ability to understand and analyze human expression. 

This gap is shrinking with the disruptive emergence of advanced artificial intelligence (AI) and machine learning (ML). A relatively new field of study emerged, leveraging powerful algorithms to allow machines to make decisions or respond by drawing on human language information. It's called natural language processing (NLP). 

What is NLP?

Natural language processing falls under the field of computer science and AI. Its purpose is to develop and test new ways for computers to understand written and spoken words in a manner that is similar to humans. It uses a combination of computational linguistics, machine learning, and statistical modeling to enable a computer to understand the intended meaning of a human expression, whether in text or voice. 

Human language and communication are pretty complex constructs with many facets and layers. In English alone, we see the same words have different meanings based on tone, inflection, pronunciation, sentence structure, and context. None of these words have any inherent meaning for systems that interpret electrical voltages as 0s and 1s. 

To overcome this, programmers integrate speech recognition technologies and build specialized tasks into NLP programs. Capabilities like part-of-speech tagging, which teaches the program to interpret meaning based on grammar, usage, and context, help these solutions deliver a much higher degree of language understanding than was previously possible. 
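As a rough illustration of how grammatical context can settle the meaning of an ambiguous word, here is a toy sketch in Python. The word lists and the `tag_ambiguous` function are hypothetical and invented for this example. Production taggers rely on statistical models trained on annotated text rather than hand-written rules like these.

```python
import re

# Hypothetical word lists, chosen only for illustration.
DETERMINERS = {"a", "an", "the", "this", "that", "my", "your", "our"}
VERB_CUES = {"to", "i", "you", "we", "they", "will", "can", "must", "please"}
AMBIGUOUS = {"record", "present", "object", "permit"}


def tag_ambiguous(sentence):
    """Guess NOUN or VERB for each ambiguous word from the word before it."""
    words = re.findall(r"[a-z']+", sentence.lower())
    tags = []
    for i, word in enumerate(words):
        if word not in AMBIGUOUS:
            continue
        previous = words[i - 1] if i > 0 else ""
        if previous in DETERMINERS:
            tag = "NOUN"      # "the record", "a permit"
        elif previous in VERB_CUES:
            tag = "VERB"      # "please record", "to object"
        else:
            tag = "UNKNOWN"   # not enough context for this toy rule set
        tags.append((word, tag))
    return tags


print(tag_ambiguous("Please record the record"))
```

The same spelling, "record", receives two different tags here because the neighboring words differ, which is the kind of contextual cue the paragraph above describes.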

NLP technology has some obvious applications. These include translation, digital assistants, home automation, directions, and automated help. One less obvious use case, though, is in cybersecurity. 

How is it applied to cybersecurity?

NLP carries dramatic implications for both the attack and defense sides of cybersecurity. From the attacker's point of view, NLP can help gather information on potential targets from social media or other publicly available information. 

That helps generate and automate highly sophisticated and tailored attacks, like an enhanced version of spear-phishing. An attacker could even use NLP to impersonate trusted individuals to gather sensitive information. 

Why is it helpful for stopping phishing?

Let's look at spear-phishing again, but this time from the point of view of securing critical systems. Spear-phishing is an attack method where malicious actors send tailored messages to gather sensitive information from a potential victim. If malicious actors can use this technology to create more sophisticated threats, cybersecurity practitioners must consider how they can leverage NLP to protect their systems. 

An MIT position paper developed by Michael C. Kotson and Alexia Schulz highlights how cyber practitioners can use NLP technologies to identify and profile spear-phishing attacks and the attackers themselves. These systems can determine whether an email is random spam or part of a coordinated attack and even home in on the types of information the attackers seek to compromise. 

That could be valuable information for the cyber team, which can then take additional measures to protect the data in question. In contrast, traditional cybersecurity tools like secure email gateways often fall short in understanding the contextual implications of text-based email attacks. 

NLP's ability to process language into actionable insights has powerful, real-world implications in the cyber arena. A group of university students recently collaborated to publish an academic paper on how they used NLP to thwart the efforts of malicious social engineers. The study found that all real-world phishing attacks contain at least one of two characteristics: 

  1. The malicious communication will ask a question whose answer is private or sensitive, or 
  2. It will issue a command to perform an action that harms the victim or business. 

With this information, they could leverage NLP to understand when these questions or orders were present in emails and react accordingly. 
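The two characteristics above can be sketched as a simple keyword heuristic in Python. This is not the method from the study and not how any particular product works. The term lists, function names, and sentence-splitting rule are all assumptions made for illustration, and a real system would use trained language models rather than fixed word lists.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
SENSITIVE_TERMS = {"password", "passcode", "pin", "login", "account number", "credit card"}
RISKY_VERBS = {"send", "transfer", "wire", "click", "download", "open", "pay", "install"}
QUESTION_STARTERS = {"what", "which", "who", "where", "can", "could", "would", "will", "do", "does"}
POLITE_PREFIXES = {"please", "kindly"}


def split_sentences(text):
    # Naive split: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    return re.findall(r"[a-z']+", sentence.lower())


def is_sensitive_question(sentence):
    """Characteristic 1: a question whose answer is private or sensitive."""
    words = tokens(sentence)
    if not words:
        return False
    looks_like_question = sentence.rstrip().endswith("?") or words[0] in QUESTION_STARTERS
    padded = " " + " ".join(words) + " "  # padding gives whole-word matching
    return looks_like_question and any(f" {term} " in padded for term in SENSITIVE_TERMS)


def is_risky_command(sentence):
    """Characteristic 2: a command to perform a potentially harmful action."""
    words = tokens(sentence)
    while words and words[0] in POLITE_PREFIXES:
        words = words[1:]  # "Please wire ..." is still an imperative
    return bool(words) and words[0] in RISKY_VERBS


def flag_email(text):
    """Return (sentence, reason) pairs for sentences matching either characteristic."""
    flagged = []
    for sentence in split_sentences(text):
        if is_sensitive_question(sentence):
            flagged.append((sentence, "sensitive question"))
        if is_risky_command(sentence):
            flagged.append((sentence, "risky command"))
    return flagged


email = "Hi Sam. What is your password? Please wire the funds today. See you at lunch."
for sentence, reason in flag_email(email):
    print(f"{reason}: {sentence}")
```

A fixed word list like this would miss paraphrased requests and flag plenty of harmless messages, which is why the approach described above depends on NLP to interpret the sentence rather than match keywords.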

Stop spear-phishing attacks with Egress Defend

Egress Defend combines NLP with zero-trust models and advanced machine learning to help organizations detect and neutralize even the most sophisticated phishing attacks. Real-time, contextualized alerts, with guidance and feedback in plain language, help ease the burden on cyber teams by transforming everyday employees into cybersecurity assets. It accomplishes this with zero administrative overhead and no rules or quarantine to manage. 

To find out more, visit our product page, or schedule a demo with one of our experts.