Set Theory

An interactive guide — the foundation of everything in math and CS

1. What Is a Set?

A set is a collection of distinct things. That's it. No order, no duplicates. The "things" are called elements or members.

You use sets every day without thinking about it:

Your Spotify playlist is a set of songs
Your shopping cart is a set of items
Users who liked a post — that's a set of user IDs
The permissions a role has — a set of strings

$$A = \{1, 2, 3, 4, 5\}$$

A set of the first 5 positive integers. Curly braces = "this is a set."

Key properties that make a set a set:

No duplicates: $\{1, 2, 2, 3\} = \{1, 2, 3\}$. If you add a duplicate, nothing changes.
No order: $\{3, 1, 2\} = \{1, 2, 3\}$. The arrangement doesn't matter.
Membership is binary: Something is either in the set or it isn't. No "kind of in the set."

Real-World: Database Query Results

When you run SELECT DISTINCT user_id FROM orders, you get a set. The DISTINCT keyword literally enforces the set property — no duplicates.

In Python: user_ids = set(row['user_id'] for row in orders)

In JavaScript: const userIds = new Set(orders.map(o => o.userId))

Every programming language has a Set data structure because sets are that fundamental.

Real-World: Git — Changed Files

When you run git diff --name-only, you get a set of filenames that changed. Each file appears at most once, order doesn't matter — you care about which files changed, not in what sequence.

2. Set Notation & Membership

Math has compact notation for sets. Here's the Rosetta Stone:

$x \in A$ — "$x$ is an element of $A$" (x belongs to A)

$x \notin A$ — "$x$ is NOT an element of $A$"

$|A|$ — "the size of $A$" (how many elements, also called cardinality)

$\emptyset$ or $\{\}$ — the empty set (a set with nothing in it)

Two Ways to Define a Set

1. Roster notation — just list the elements:

$$\text{Vowels} = \{a, e, i, o, u\}$$

2. Set-builder notation — describe a rule:

$$\text{EvenNumbers} = \{x \mid x \text{ is an integer and } x \mod 2 = 0\}$$

Read the "|" as "such that". This says: "the set of all x, such that x is an integer and x is divisible by 2."

Real-World: Filtering Lists in Code

Set-builder notation is exactly a filter/comprehension:

Math: $B = \{x \in \text{Users} \mid x.\text{age} \geq 18\}$

Python: B = {u for u in users if u.age >= 18}

SQL: SELECT * FROM users WHERE age >= 18

JS: const B = new Set(users.filter(u => u.age >= 18))

Same idea, different syntax. The math notation is actually shorter than most code.

Common Number Sets

Mathematicians use special letters for number sets you'll see everywhere:

$\mathbb{N}$ = Natural numbers $\{0, 1, 2, 3, \ldots\}$ (some definitions start at 1)
$\mathbb{Z}$ = Integers $\{\ldots, -2, -1, 0, 1, 2, \ldots\}$ (Z for "Zahlen", German for numbers)
$\mathbb{Q}$ = Rational numbers (fractions like $\frac{3}{4}$)
$\mathbb{R}$ = Real numbers (all points on the number line, including $\pi$, $\sqrt{2}$)

3. Types of Sets

Empty Set

$\emptyset = \{\}$ — a set with zero elements. Think: the result of a search query that matches nothing.

$$|\emptyset| = 0$$

Real-World: Empty Search Results

Search "asdfghjklqwerty" on Amazon. You get $\emptyset$ — an empty set of products. The set exists (it's a valid container), it just has nothing in it.

In code: results = set() # valid, but empty

Important: $\emptyset \neq \{\emptyset\}$. An empty bag is different from a bag containing an empty bag!

Finite vs Infinite Sets

Finite: $\{1, 2, 3\}$ — you can count the elements and stop. Size = 3.
Infinite: $\mathbb{N} = \{0, 1, 2, 3, \ldots\}$ — keeps going forever.

In programming, we only ever deal with finite sets (memory is finite). But infinite sets matter in theory — they're why algorithms have complexity classes.

Universal Set

The universal set $U$ is "everything we're currently talking about." It depends on context:

If we're talking about employees at a company, $U$ = all employees
If we're talking about integers from 1-100, $U = \{1, 2, \ldots, 100\}$
If we're talking about valid HTTP status codes, $U = \{100, 101, \ldots, 599\}$

4. Subsets & Supersets

If every element of $A$ is also in $B$, then $A$ is a subset of $B$:

$$A \subseteq B \iff \text{every } x \in A \implies x \in B$$

And $B$ is a superset of $A$: $B \supseteq A$.

If $A$ is a subset but $A \neq B$ (meaning $B$ has extra stuff), it's a proper subset: $A \subset B$.

Quick examples:

$\{1, 2\} \subseteq \{1, 2, 3\}$ ✓
$\{1, 2, 3\} \subseteq \{1, 2, 3\}$ ✓ (every set is a subset of itself)
$\emptyset \subseteq A$ for ANY set $A$ (the empty set is a subset of everything)
$\{1, 4\} \subseteq \{1, 2, 3\}$? ✗ (4 is not in the second set)

Real-World: Role-Based Access Control (RBAC)

In any app with roles and permissions:

$\text{ViewerPerms} = \{\text{read}\}$

$\text{EditorPerms} = \{\text{read}, \text{write}\}$

$\text{AdminPerms} = \{\text{read}, \text{write}, \text{delete}, \text{manage\_users}\}$

Notice: $\text{ViewerPerms} \subset \text{EditorPerms} \subset \text{AdminPerms}$

That's exactly how permission hierarchies work. Checking "does this user have permission X?" is literally checking set membership: $X \in \text{UserPerms}$.

Checking "can a Viewer do everything an Editor can?" is checking: $\text{EditorPerms} \subseteq \text{ViewerPerms}$? No — so Viewers can't do all Editor tasks.

Real-World: TypeScript Types

TypeScript's type system is literally set theory:

type Animal = { name: string }

type Dog = { name: string; breed: string }

Dog is a subtype of Animal because every Dog has everything an Animal has (and more). In set theory terms: the set of all possible Dog objects is a subset of the set of all possible Animal objects.

Union types (string | number) and intersection types (A & B) are literally set union and intersection. TypeScript IS applied set theory.

5. Union

The union of two sets is everything that's in either set (or both):

$$A \cup B = \{x \mid x \in A \text{ OR } x \in B\}$$

Think of it as merging two lists and removing duplicates.

Example:

$$\{1, 2, 3\} \cup \{3, 4, 5\} = \{1, 2, 3, 4, 5\}$$

The 3 appears in both, but only once in the union.

Properties of union:

Commutative: $A \cup B = B \cup A$
Associative: $(A \cup B) \cup C = A \cup (B \cup C)$
Identity: $A \cup \emptyset = A$ (adding nothing changes nothing)
Idempotent: $A \cup A = A$

Real-World: Merging Search Results

Google searches multiple indexes. Each index returns a set of results. The final results page is the union of all those sets.

results = google_web_results | google_news_results | google_scholar_results

In Python, the | operator on sets is literally union: set_a | set_b

In SQL: SELECT ... UNION SELECT ... gives you the union (with automatic dedup).

Real-World: Notification Preferences

$\text{EmailNotifs} = \{\text{orders}, \text{promotions}, \text{security}\}$

$\text{PushNotifs} = \{\text{messages}, \text{orders}, \text{reminders}\}$

$\text{AllNotifTypes} = \text{EmailNotifs} \cup \text{PushNotifs} = \{\text{orders}, \text{promotions}, \text{security}, \text{messages}, \text{reminders}\}$

"Show me all the types of notifications this user gets through any channel" = union.

6. Intersection

The intersection is everything that's in BOTH sets:

$$A \cap B = \{x \mid x \in A \text{ AND } x \in B\}$$

Example:

$$\{1, 2, 3\} \cap \{3, 4, 5\} = \{3\}$$

If two sets have no elements in common, their intersection is empty: $A \cap B = \emptyset$. These sets are called disjoint.

Properties:

Commutative: $A \cap B = B \cap A$
Associative: $(A \cap B) \cap C = A \cap (B \cap C)$
Identity: $A \cap U = A$ (intersecting with everything gives you back yourself)
Annihilation: $A \cap \emptyset = \emptyset$

Real-World: Finding Common Friends

Facebook's "mutual friends" feature is literally set intersection:

$\text{YourFriends} \cap \text{TheirFriends} = \text{MutualFriends}$

In Python: mutual = your_friends & their_friends (the & operator on sets)

In SQL: SELECT friend_id FROM your_friends INTERSECT SELECT friend_id FROM their_friends

Real-World: Feature Flags

$\text{EnabledFlags} = \{\text{dark\_mode}, \text{new\_checkout}, \text{ai\_search}\}$

$\text{BetaFlags} = \{\text{new\_checkout}, \text{ai\_search}, \text{voice\_nav}\}$

$\text{EnabledFlags} \cap \text{BetaFlags} = \{\text{new\_checkout}, \text{ai\_search}\}$

"Which beta features are currently enabled?" = intersection.

7. Difference & Complement

Set Difference

$A \setminus B$ (also written $A - B$) is everything in $A$ that's NOT in $B$:

$$A \setminus B = \{x \mid x \in A \text{ AND } x \notin B\}$$

Example:

$$\{1, 2, 3, 4, 5\} \setminus \{3, 4\} = \{1, 2, 5\}$$

Unlike union and intersection, difference is NOT commutative: $A \setminus B \neq B \setminus A$ (usually).

Symmetric Difference

$A \triangle B$ is everything in exactly one of the sets, but NOT in both:

$$A \triangle B = (A \setminus B) \cup (B \setminus A) = (A \cup B) \setminus (A \cap B)$$

Complement

The complement of $A$ (written $A'$ or $\overline{A}$ or $A^c$) is everything in the universal set $U$ that's NOT in $A$:

$$A^c = U \setminus A = \{x \in U \mid x \notin A\}$$

Real-World: Unread Emails

$U$ = all emails in your inbox

$\text{Read}$ = set of emails you've opened

$\text{Unread} = U \setminus \text{Read} = \text{Read}^c$

Your "unread count" is $|U \setminus \text{Read}|$.

Real-World: Git Diff

$\text{FilesOnMain} = \{a.js, b.js, c.js, d.js\}$

$\text{FilesOnBranch} = \{a.js, b.js, e.js, f.js\}$

Files added on branch: $\text{FilesOnBranch} \setminus \text{FilesOnMain} = \{e.js, f.js\}$

Files deleted on branch: $\text{FilesOnMain} \setminus \text{FilesOnBranch} = \{c.js, d.js\}$

Files that changed (added or deleted): $\text{FilesOnMain} \triangle \text{FilesOnBranch} = \{c.js, d.js, e.js, f.js\}$

Files unchanged: $\text{FilesOnMain} \cap \text{FilesOnBranch} = \{a.js, b.js\}$

8. Interactive Venn Diagrams

Venn diagrams visualize set operations. Enter elements for sets A and B, then see the result of each operation:

A: B:

Operation:

9. Cartesian Product

The Cartesian product $A \times B$ is the set of ALL possible ordered pairs — one element from $A$ and one from $B$:

$$A \times B = \{(a, b) \mid a \in A \text{ AND } b \in B\}$$

Example:

$$\{1, 2\} \times \{x, y\} = \{(1,x), (1,y), (2,x), (2,y)\}$$

Size: $|A \times B| = |A| \cdot |B|$. With 2 elements and 2 elements, you get 4 pairs.

This extends to more sets: $A \times B \times C$ gives you ordered triples $(a, b, c)$.

Real-World: Database JOINs

A CROSS JOIN in SQL is literally the Cartesian product:

SELECT * FROM sizes CROSS JOIN colors

If sizes = {S, M, L} and colors = {Red, Blue}, you get 6 rows: (S,Red), (S,Blue), (M,Red), (M,Blue), (L,Red), (L,Blue).

This is why CROSS JOINs can be dangerous — if both tables have 10,000 rows, the result has 100,000,000 rows. $10000 \times 10000 = 10^8$.

Other JOINs (INNER, LEFT, etc.) are Cartesian products followed by a filter (WHERE clause).

Real-World: E-commerce Product Variants

$\text{Sizes} = \{S, M, L, XL\}$

$\text{Colors} = \{\text{Black}, \text{White}, \text{Navy}\}$

$\text{AllVariants} = \text{Sizes} \times \text{Colors}$

Total variants = $4 \times 3 = 12$: (S,Black), (S,White), (S,Navy), (M,Black), ...

Add a third dimension: $\text{Materials} = \{\text{Cotton}, \text{Polyester}\}$

$\text{AllSKUs} = \text{Sizes} \times \text{Colors} \times \text{Materials}$, giving $4 \times 3 \times 2 = 24$ SKUs.

Interactive: Cartesian Product Builder

A: B:

10. Power Set

The power set $\mathcal{P}(A)$ is the set of ALL possible subsets of $A$ — including the empty set and $A$ itself:

$$\text{If } A = \{1, 2, 3\}, \text{ then:}$$ $$\mathcal{P}(A) = \{\emptyset, \{1\}, \{2\}, \{3\}, \{1,2\}, \{1,3\}, \{2,3\}, \{1,2,3\}\}$$

Size: $|\mathcal{P}(A)| = 2^{|A|}$. A set with 3 elements has $2^3 = 8$ subsets.

Why $2^n$? For each element, you have a binary choice: include it or not. That's $n$ yes/no decisions, giving $2^n$ combinations.

Real-World: Feature Combinations in A/B Testing

You have 4 features to test: $\{\text{dark\_mode}, \text{new\_nav}, \text{ai\_chat}, \text{redesigned\_cart}\}$.

The power set gives you ALL possible combinations = $2^4 = 16$ test groups, including "no features" (control) and "all features."

This is why testing all combinations of $n$ features grows exponentially. With 10 features, you'd need $2^{10} = 1024$ test groups. With 20 features: over a million. This is exactly why full factorial testing is impractical and we use partial factorial designs.

Interactive: Power Set Generator

Set:

11. De Morgan's Laws

These two laws are the most important rules connecting union, intersection, and complement. They show up everywhere — in SQL, boolean logic, circuit design, and everyday coding.

Law 1: $(A \cup B)^c = A^c \cap B^c$

"NOT (A OR B)" is the same as "(NOT A) AND (NOT B)"

Law 2: $(A \cap B)^c = A^c \cup B^c$

"NOT (A AND B)" is the same as "(NOT A) OR (NOT B)"

In English:

Law 1: "Not in either" = "Not in A, and not in B"
Law 2: "Not in both" = "Not in A, or not in B"

Real-World: Simplifying If-Statements

You write this guard clause:

if (!(isAdmin || isModerator)) { deny(); }

By De Morgan's Law 1, this is identical to:

if (!isAdmin && !isModerator) { deny(); }

Both mean "if the user is neither an admin nor a moderator, deny access."

Another one:

if (!(hasTicket && hasID)) { reject(); }

By Law 2:

if (!hasTicket || !hasID) { reject(); }

"If they don't have both a ticket and ID" = "If they're missing a ticket, or missing ID, or both."

Real-World: SQL WHERE Clauses

WHERE NOT (status = 'active' OR role = 'admin')

is equivalent to (by De Morgan):

WHERE status != 'active' AND role != 'admin'

Query optimizers use De Morgan's laws automatically to rewrite queries into more efficient forms.

Interactive: De Morgan's Truth Table

Pick an element and see how both sides of De Morgan's laws evaluate to the same thing:

x ∈ A? x ∈ B?

x ∈ A	x ∈ B	(A∪B)ᶜ	Aᶜ ∩ Bᶜ	Match?	(A∩B)ᶜ	Aᶜ ∪ Bᶜ	Match?

12. Cardinality & Counting

Cardinality = size of a set. $|A|$ is how many elements are in $A$.

The most important counting formula in set theory is the Inclusion-Exclusion Principle:

$$|A \cup B| = |A| + |B| - |A \cap B|$$

You can't just add the sizes — you'd count the overlap twice. Subtract it once to fix.

For three sets:

$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|$$

|A|: 15 |B|: 12 |A∩B|: 5

Real-World: Survey Analysis

A survey of 200 developers:

120 know JavaScript
90 know Python
50 know both

How many know at least one? $|JS \cup Py| = 120 + 90 - 50 = 160$

How many know neither? $200 - 160 = 40$

How many know only JavaScript? $120 - 50 = 70$

How many know only Python? $90 - 50 = 40$

Check: $70 + 40 + 50 + 40 = 200$ ✓ (only-JS + only-Py + both + neither = total)

Real-World: Bug Tracking

Your bug tracker shows:

$|A|$ = 45 bugs in the frontend
$|B|$ = 30 bugs in the backend
$|A \cap B|$ = 8 bugs that affect both (full-stack bugs)

Total unique bugs: $|A \cup B| = 45 + 30 - 8 = 67$

Without inclusion-exclusion, you'd report 75 bugs. 8 of those are duplicates counted twice.

Summary: The Cheat Sheet

Operation	Notation	Code	Meaning
Union	$A \cup B$	a \| b	In A or B (or both)
Intersection	$A \cap B$	a & b	In both A and B
Difference	$A \setminus B$	a - b	In A but not B
Symmetric Diff	$A \triangle B$	a ^ b	In one but not both
Complement	$A^c$	u - a	Not in A
Subset	$A \subseteq B$	a <= b	Every element of A is in B
Cardinality	$\|A\|$	len(a)	Number of elements
Cartesian Product	$A \times B$	itertools.product(a,b)	All pairs
Power Set	$\mathcal{P}(A)$	combinations loop	All subsets

← Back to all guides