# Email Address

# RFC 5322 Check

This check verifies that the email address is syntactically valid according to RFC 5322, the standard that specifies the syntax of internet email addresses. It ensures that the address has a valid local part and domain, contains no forbidden characters or sequences (such as consecutive dots), and places symbols such as "@" and "." correctly. Enforcing these rules filters out clearly invalid addresses before more advanced checks run.

Examples of emails that will fail this check:

| Email | Reason |
| --- | --- |
| jane..doe@example.com | Double dot in local part |
| .john@example.com | Leading dot |
| john@.com | Invalid domain |
| john@ | Missing domain |
| john doe@example.com | Unescaped space |
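
The failures listed above can be reproduced with a deliberately simplified syntax check. The sketch below is not a full RFC 5322 parser: it accepts only the unquoted "dot-atom" form of the local part and plain hostname labels in the domain, and it ignores quoted strings, comments, and domain literals. The function name `passes_basic_syntax` is hypothetical.

```python
import re

# Characters allowed in an unquoted local part (RFC 5322 "atext").
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
# Dot-atom: runs of atext separated by single dots (no leading,
# trailing, or consecutive dots).
_LOCAL_RE = re.compile(_ATEXT + r"(?:\." + _ATEXT + r")*")
# Hostname label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def passes_basic_syntax(email: str) -> bool:
    """Return True if the address passes this simplified syntax check."""
    local, at_sign, domain = email.rpartition("@")
    if not at_sign or not local or not domain:
        return False  # missing "@", local part, or domain
    if not _LOCAL_RE.fullmatch(local):
        return False  # forbidden character or misplaced dot
    # Every dot-separated label must be non-empty and well-formed.
    return all(_LABEL_RE.fullmatch(label) for label in domain.split("."))


for address in ("jane..doe@example.com", ".john@example.com", "john@.com",
                "john@", "john doe@example.com", "jane.doe@example.com"):
    print(address, passes_basic_syntax(address))
```

Only the last address in the loop passes; the other five are rejected for the reasons given in the table.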

This check influences the fraud score as follows:

| Rule | Description | Fraud Score |
| --- | --- | --- |
| email.rfc5322 | The email address failed to pass the RFC 5322 validation | +100 |

# Disposable Domain

This check identifies whether the email address uses a disposable or temporary email service, such as Mailinator, 10MinuteMail, or Guerrilla Mail. Disposable emails are often used to bypass free tier limits or create fraudulent accounts because they allow users to quickly generate throwaway inboxes. The system compares the domain part of the email (example.com) against a maintained list of known disposable domains and providers.

Examples of emails that will fail this check:

  • user@mailinator.com
  • temp@guerrillamail.com
  • signup@10minutemail.com
  • fake@trashmail.com
  • quick@dispostable.com
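
A minimal sketch of the domain lookup, assuming the maintained list is available as an in-memory set. The set below contains only the five domains from the examples above, and the name `is_disposable` is hypothetical; a real list would be far larger and refreshed regularly.

```python
# Illustrative subset only; the real list is maintained separately.
DISPOSABLE_DOMAINS = frozenset({
    "mailinator.com",
    "guerrillamail.com",
    "10minutemail.com",
    "trashmail.com",
    "dispostable.com",
})


def is_disposable(email: str) -> bool:
    """Return True if the domain part is on the disposable-domain list."""
    # Domains are case-insensitive, so normalize before the lookup.
    domain = email.rpartition("@")[2].strip().lower()
    return domain in DISPOSABLE_DOMAINS


print(is_disposable("user@Mailinator.com"))
print(is_disposable("user@example.com"))
```

This sketch matches exact domains only. Whether subdomains of a listed provider should also match is a design choice the text above does not specify.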

This check influences the fraud score as follows:

| Rule | Description | Fraud Score |
| --- | --- | --- |
| email.disposable | The email address belongs to a disposable email service | +10 |

# Role-Based Address Check

This check detects whether the email address is a generic role-based account instead of belonging to an individual user. Addresses like admin@, support@, and info@ are commonly shared by teams or used for automated communication rather than personal sign-ups. Blocking or flagging these helps prevent abuse, ensures accountability, and reduces the risk of low-quality registrations.

Examples of emails that will fail this check:

  • support@example.com
  • info@example.com
  • admin@example.com
  • contact@example.com
  • sales@example.com

This check influences the fraud score as follows:

| Rule | Fraud Score |
| --- | --- |
| email.dummy_role | +5 |

# Known Provider Check

This check detects whether the email address belongs to a well-known email provider, such as Gmail, Outlook, or Yahoo. These domains are widely trusted and consistently configured, so when a known provider is detected, domain-related checks (such as MX records and domain age) are skipped as redundant. This check does not affect the fraud score in any way.
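
The skip logic can be sketched as a simple gate in front of the domain-related checks. Everything named here is an assumption made for illustration: the provider set, the check identifiers `mx_records` and `domain_age`, and the function `domain_checks_to_run`.

```python
# Illustrative subset of well-known providers (assumed domains).
KNOWN_PROVIDER_DOMAINS = frozenset({"gmail.com", "outlook.com", "yahoo.com"})

# Hypothetical identifiers for the domain-related checks.
DOMAIN_CHECKS = ["mx_records", "domain_age"]


def domain_checks_to_run(email: str) -> list[str]:
    """Return the domain-related checks that still need to run."""
    domain = email.rpartition("@")[2].lower()
    if domain in KNOWN_PROVIDER_DOMAINS:
        return []  # known provider: domain checks are redundant
    return list(DOMAIN_CHECKS)


print(domain_checks_to_run("jane@gmail.com"))
print(domain_checks_to_run("jane@example.com"))
```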

# Separator Check

This check analyzes the use of separators in the local part of the email address to detect patterns often associated with fraudulent or auto-generated accounts. It performs three validations:

  1. Identifies whether the local part is split into fragments by dots (.) or hyphens (-), such as j.o.h.n.d.o.e or j-o-h-n-d-o-e.
  2. Detects consecutive separators, for example .., --, or __, which are commonly used to evade filters.
  3. Calculates the overall separator density and flags the address if separators exceed 30% of the local part's length.
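
The three validations can be sketched as follows. The text does not define exactly when a local part counts as "split into fragments", so this sketch assumes it means at least four fragments that are each a single character. That threshold, the `min_fragments` parameter, and the function name `separator_rules` are assumptions. Consecutive separators are taken to mean the same separator character repeated, as in the examples.

```python
import re


def separator_rules(local: str, min_fragments: int = 4,
                    max_density: float = 0.3) -> list[str]:
    """Return the separator rules triggered by a local part."""
    if not local:
        return []
    rules = []

    # 1. Fragmentation by dots or hyphens (assumed definition: at least
    #    `min_fragments` pieces, every piece a single character).
    fragments = [f for f in re.split(r"[.-]", local) if f]
    if len(fragments) >= min_fragments and all(len(f) == 1 for f in fragments):
        rules.append("email.separator_abuse")

    # 2. The same separator repeated back to back: "..", "--", or "__".
    if re.search(r"([._-])\1", local):
        rules.append("email.consecutive_separator")

    # 3. Share of separator characters in the whole local part.
    density = sum(ch in "._-" for ch in local) / len(local)
    if density > max_density:
        rules.append("email.separator_density")

    return rules


print(separator_rules("j.o.h.n.d.o.e"))
print(separator_rules("john__doe"))
print(separator_rules("john.doe"))
```

For `j.o.h.n.d.o.e`, six of the thirteen characters are dots (about 46%), so the density rule fires together with the fragmentation rule.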

This check influences the fraud score as follows:

| Rule | Fraud Score |
| --- | --- |
| email.separator_abuse | +10 |
| email.consecutive_separator | +10 |
| email.separator_density > 0.3 | +5 |
| email.repeated_chars | +5 |

# Tag Check

This check analyzes the tag portion of an email address, defined as the substring appearing after the first separator character (+, ., _, or -) in the local part of the email. A tag is considered suspicious if it is at least 8 characters long and has entropy equal to or greater than 3, indicating likely random or scripted generation. Entropy is computed using Shannon entropy over the characters in the tag, measuring how unpredictable or varied the string is.

For example, the following emails will fail this check because their tags are both long and high-entropy:

  • user+3f9xQz8p@gmail.com
  • admin.8fjsklqp@company.org
  • test_user-znv93kdj@example.com
  • signup.qp9xk7v2@domain.net
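
A minimal sketch of the tag extraction and entropy test, assuming entropy is measured in bits (base-2 logarithm); the text does not state the base. Under that assumption, a tag of eight distinct characters has an entropy of exactly 3, which is consistent with the examples above. The helper names are hypothetical.

```python
import math
import re
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(text).values())


def extract_tag(local: str) -> str:
    """Return everything after the first +, ., _ or - in the local part."""
    parts = re.split(r"[+._-]", local, maxsplit=1)
    return parts[1] if len(parts) == 2 else ""


def has_suspicious_tag(email: str, min_length: int = 8,
                       min_entropy: float = 3.0) -> bool:
    """Return True if the tag is both long and high-entropy."""
    tag = extract_tag(email.rpartition("@")[0])
    return len(tag) >= min_length and shannon_entropy(tag) >= min_entropy


print(has_suspicious_tag("user+3f9xQz8p@gmail.com"))
print(has_suspicious_tag("john.doe@example.com"))
```

In this sketch the tag of `test_user-znv93kdj@example.com` is `user-znv93kdj`, because the underscore is the first separator.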

This check influences the fraud score as follows:

| Rule | Description | Fraud Score |
| --- | --- | --- |
| email.suspicious_tag | length ≥ 8, entropy ≥ 3 | +5 |

# Composition Check

The Composition Check evaluates whether the local part of an email address (everything before the "@") looks artificial, suspicious, or non-human. It applies several rules to confirm that the local part is well-formed and natural-looking. Specifically, the check fails if the local part is too short (fewer than 2 characters) or too long (more than 30 characters), if it consists only of digits, or if more than 50% of its characters are digits.

It also flags addresses containing five consecutive digits, no vowels at all, or unusually high entropy (>4), which suggests random or automated generation. Additional validations include rejecting addresses with repeated characters (like "aaaaaa"), mixed writing scripts (Latin, Greek, Cyrillic combined), or emojis in the local part.

Below are examples of email addresses that will fail the Composition Check:

| Email | Reason |
| --- | --- |
| 1@domain.com | Too short |
| 1234567890@domain.com | Only digits |
| john123456@domain.com | Five digits in a row |
| x9q2z5k1v8s4d0@domain.com | High entropy |
| bcd@domain.com | No vowels |
| aaaaaaaaaaa@domain.com | Repeated characters |
| μαθήματα@domain.com | Mixed scripts (Greek + Latin) |
| smile😊@domain.com | Contains emoji |
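
Most of these rules can be sketched in a few lines each. The sketch below makes several assumptions that the text does not confirm: entropy is measured in bits, vowels are the ASCII letters a, e, i, o and u, and the script of a letter is inferred from its Unicode character name. Emoji detection is omitted because it needs a Unicode emoji data source. The function names are hypothetical.

```python
import math
import re
import unicodedata
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(text).values())


def scripts_used(text: str) -> set[str]:
    """Scripts (Latin, Greek, Cyrillic) found among the letters of `text`."""
    found = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            for script in ("LATIN", "GREEK", "CYRILLIC"):
                if name.startswith(script):
                    found.add(script)
    return found


def composition_rules(local: str) -> list[str]:
    """Return the composition rules triggered by a local part."""
    rules = []
    digits = sum(ch.isdigit() for ch in local)
    if len(local) < 2:
        rules.append("email.name_too_short")
    if len(local) > 30:
        rules.append("email.name_too_long")
    if local and digits == len(local):
        rules.append("email.all_digits")
    if local and digits / len(local) > 0.5:
        rules.append("email.large_digit_ratio")
    if re.search(r"\d{5}", local):
        rules.append("email.five_digits_in_a_row")
    if not any(ch in "aeiou" for ch in local.lower()):
        rules.append("email.lacks_vowels")
    if shannon_entropy(local) > 4:
        rules.append("email.random_local")
    if re.search(r"(.)\1{4}", local):  # same character five times in a row
        rules.append("email.repeated_chars")
    if len(scripts_used(local)) > 1:
        rules.append("email.mixed_scripts")
    return rules


print(composition_rules("john123456"))
print(composition_rules("john.doe"))
```

A single address can trigger several rules at once. In this sketch, `john123456` triggers both the digit-ratio rule (6 of 10 characters are digits) and the five-digits-in-a-row rule.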

This check influences the fraud score as follows:

| Rule | Description | Fraud Score |
| --- | --- | --- |
| email.name_too_short | length < 2 | +5 |
| email.name_too_long | length > 30 | +10 |
| email.all_digits | only digits | +10 |
| email.large_digit_ratio | over 50% digits | +5 |
| email.five_digits_in_a_row | five digits in a row | +5 |
| email.lacks_vowels | no vowels | +10 |
| email.random_local | entropy > 4 | +5 |
| email.repeated_chars | five repeated characters | +5 |
| email.mixed_scripts | mixed scripts (Latin, Greek, Cyrillic) | +10 |
| email.with_emoji | contains emojis | +10 |