Need to wrangle messy campaign names, clean URLs, or build smarter filters? Regex expressions in Looker Studio dashboards let you classify, extract, and normalize data without leaving your report. This guide gives you the exact patterns, functions, and workflows we use to make dashboards clearer, faster, and more useful for decision-making.
Below you’ll find practical Looker Studio regex examples, a marketer-friendly regex cheat sheet, and step-by-step instructions. We’ll cover REGEXP_CONTAINS, REGEXP_MATCH, REGEXP_EXTRACT, and REGEXP_REPLACE, plus language-specific tactics for international campaigns.
Functions You Need: REGEXP_CONTAINS, REGEXP_MATCH, REGEXP_EXTRACT, REGEXP_REPLACE
REGEXP_CONTAINS — Purpose, Syntax, Marketing Example
REGEXP_CONTAINS checks whether a field includes a pattern anywhere inside it. Use it for quick classification, segment filters, and broad matching. Syntax: REGEXP_CONTAINS(field, pattern) returns true or false, so it’s perfect for CASE statements and filter controls. Think of it as “does this text contain X?”.
Core uses:
- Brand vs. Non-Brand: REGEXP_CONTAINS(Query, "(?i)brand|branded")
- Channel grouping: REGEXP_CONTAINS(Source, "(?i)facebook|instagram|meta")
- UTM hygiene: REGEXP_CONTAINS(utm_campaign, "(?i)sale|promo|black(-|\\s)?friday")
Marketing example: Create a calculated field “Is Brand Search” with CASE WHEN REGEXP_CONTAINS(Query, "(?i)brandname|brand name|brnd") THEN "Brand" ELSE "Non-Brand" END. Use the (?i) flag for case-insensitive matching without editing every variation.
REGEXP_MATCH — Anchors And Full-String Matching
REGEXP_MATCH requires the entire string to match your pattern. It’s ideal when you need hard validation, strict formats, or exact categories. Anchors matter here: use ^ for start and $ for end. If you want partial matches, use REGEXP_CONTAINS instead.
When to use REGEXP_MATCH:
- Validate slugs: REGEXP_MATCH(Page Path, "^/[a-z0-9\\-]+$")
- Exact channel labels: REGEXP_MATCH(Default Channel Group, "(?i)organic search|paid search|direct")
- Clean product IDs: REGEXP_MATCH(Product ID, "^[A-Z]{3}-\\d{5}$")
Tip: If you really want to use MATCH for partial checks, wrap with .* like REGEXP_MATCH(Campaign, ".*(?i)sale.*"). But that’s slower and harder to maintain—prefer REGEXP_CONTAINS for partial logic.
REGEXP_EXTRACT — Capture Groups And Examples
REGEXP_EXTRACT pulls a substring that matches a specific capture group. Use parentheses ( ) to capture the part you want. If your pattern contains a capture group, the function returns that group; otherwise it returns the full match. Perfect for UTM, path, and event parsing.
Common extraction patterns for marketers:
- Source from UTM: REGEXP_EXTRACT(Campaign, "(?i)utm_source=([^]+)") returns the source value.
- Product category from path: REGEXP_EXTRACT(Page Path, "^/(?:[a-z]{2}/)?shop/([^/]+)/") captures the first category.
- Campaign prefix: REGEXP_EXTRACT(utm_campaign, "^(?i)(brand|non[-_]?brand|prospecting|retargeting)")
Pro move: Build “debug” fields with multiple REGEXP_EXTRACTs to check edge cases across samples before you commit to a final calculated dimension.
REGEXP_REPLACE — Normalization And Cleaning
REGEXP_REPLACE standardizes messy data by replacing a matched pattern with something cleaner. Syntax: REGEXP_REPLACE(field, pattern, replacement). Use it to fix inconsistent casing, remove tracking junk, or normalize naming schemes so your charts aren’t fragmented by small variations.
Useful replacements:
- Lowercase and unify sources: REGEXP_REPLACE(LOWER(utm_source), "(facebook|fb|meta)", "facebook")
- Strip querystrings: REGEXP_REPLACE(Landing Page, "\\?.*$", "")
- Unify campaign delimiters: REGEXP_REPLACE(utm_campaign, "[-_\\s]+", "_")
Combine with CASE for bulletproof normalization: clean first with REGEXP_REPLACE, then classify with REGEXP_CONTAINS or REGEXP_MATCH.
Which Function To Use When (Quick Decision Flow)
Pick the right Looker Studio regex function in seconds using this simple logic. Faster choices mean faster dashboards and fewer brittle formulas.
- Need a yes/no check inside a CASE? Use REGEXP_CONTAINS.
- Need exact, full-string validation? Use REGEXP_MATCH with anchors ^ and $.
- Need to pull a specific token? Use REGEXP_EXTRACT with capture groups ( ).
- Need to standardize values first? Use REGEXP_REPLACE, then classify or extract.
- Unsure? Start with CONTAINS for discovery, move to MATCH/EXTRACT once stable.
How To Add Regex In Looker Studio (Calculated Fields & Filters)
Creating A Calculated Field With Regex (Step-By-Step)
Adding regex expressions in Looker Studio dashboards is straightforward. You’ll create a calculated field or use regex directly in filter controls. Here’s a clean, repeatable setup any marketer can follow.
- Open your report and data source. If new to the interface, browse via Looker Studio reporting.
- In the Data panel, click “Add a Field”. Name it clearly (e.g., “Campaign Source (Normalized)”).
- Write your formula, e.g., REGEXP_REPLACE(LOWER(utm_source), "(facebook|fb|meta)", "facebook").
- Click “Validate” to check syntax. Fix any missing parentheses or quotes.
- Save. Add the field to a table so you can verify results across real rows.
Tip: Keep an “Original vs Normalized” table on a hidden page while you iterate. It saves time and prevents surprises in your public charts.
Using Regex In Chart Filters And Advanced Filters
Regex shines in filters when you want tight, predictable segments. You can filter a chart or the entire page using calculated fields or direct regex conditions.
- Chart-level filter with CONTAINS: Include only campaigns with “sale” or “promo” using REGEXP_CONTAINS(utm_campaign, "(?i)sale|promo").
- Advanced filter with MATCH: Show only clean product IDs using REGEXP_MATCH(Product ID, "^[A-Z]{3}-\\d{5}$").
- Exclude internal traffic: REGEXP_CONTAINS(Hostname, "(?i)staging|dev|internal") set to Exclude.
Pro tip: For discovery, start with inclusive filters using REGEXP_CONTAINS. Once you see real data patterns, move to stricter MATCH rules for clean, consistent segments.
Testing Expressions Inside The Report (Sample Rows & Debug Fields)
Small test loops prevent big headaches. Validate your Looker Studio regex with live rows and temporary fields before rolling it into production charts.
- Build a “Regex Sandbox” page with a table of raw fields plus your new regex fields.
- Add a “Test Notes” text box describing pattern assumptions and open questions.
- Create temporary fields: Test_CONTAINS, Test_MATCH, Test_EXTRACT to compare outcomes.
- Sort by unexpected results (e.g., blank extractions, false positives) to spot edge cases quickly.
Once you’re confident, hide the sandbox page. Keep a copy in your team template so new analysts can reuse the same workflow.
Marketing Cheat Sheet: Ready-To-Use Regex Patterns
UTM And Campaign Parsing (Normalize, Extract Source/Medium/Campaign)
Clean UTMs drive accurate attribution. Use these patterns to normalize and extract core pieces quickly in Looker Studio regex fields.
- Normalize source to Facebook: REGEXP_REPLACE(LOWER(utm_source), "(facebook|fb|meta)", "facebook")
- Normalize Google Ads: REGEXP_REPLACE(LOWER(utm_source), "(googleads|adwords|google_ads)", "google")
- Standardize medium: REGEXP_REPLACE(LOWER(utm_medium), "(cpc|ppc|paid)", "paid")
- Extract source: REGEXP_EXTRACT(utm_campaign, "(?i)utm_source=([^]+)")
- Extract medium: REGEXP_EXTRACT(utm_campaign, "(?i)utm_medium=([^]+)")
- Extract campaign: REGEXP_EXTRACT(utm_campaign, "(?i)utm_campaign=([^]+)")
- Campaign prefix classification: CASE WHEN REGEXP_CONTAINS(utm_campaign, "^(?i)(brand|non[-_]?brand|prospecting|retargeting)") THEN "Prefixed" ELSE "Other" END
If you’re scaling paid campaigns and need consistent, cross-channel reporting, pairing these with a naming convention beats manual cleanup. For broader reporting context, see our perspective on paid media performance.
URL And Path Parsing (Domain, Category, Product ID, Remove Querystring)
Pages splinter fast when URLs aren’t normalized. Strip querystrings, extract key path tokens, and group similar pages for stable trending.
- Remove querystring: REGEXP_REPLACE(Page, "\\?.*$", "")
- Domain from URL: REGEXP_EXTRACT(Landing Page, "https?://([^/]+)")
- Top-level folder: REGEXP_EXTRACT(Page Path, "^/(?:[a-z]{2}/)?([^/]+)/?")
- Product category: REGEXP_EXTRACT(Page Path, "^/(?:[a-z]{2}/)?shop/([^/]+)/")
- Product ID at end: REGEXP_EXTRACT(Page Path, "/([A-Z]{3}-\\d{5})$")
Add a “Canonical Path” calculated field that strips parameters and trailing slashes. Then build content groupings on top—your trendlines will stop zig-zagging.
Event And Label Parsing (Extract Action Or Label Tokens)
Event names carry rich signals for intent and conversion behavior. Extract the parts you care about—action type, component, or step number—directly into dimensions.
- GA-style event_action: REGEXP_EXTRACT(Event Name, "^(add_to_cart|view_item|purchase)")
- Step number: REGEXP_EXTRACT(Event Label, "(?:step|stap|etape)[\\s:_-]?(\\d+)")
- Component name: REGEXP_EXTRACT(Event Label, "component:([a-z_]+)")
- Video percent: REGEXP_EXTRACT(Event Label, "(\\d{1,3})%")
Use REGEXP_REPLACE to standardize inconsistent labels first, then REGEXP_EXTRACT to pull the token you’ll chart. That two-step pattern improves accuracy.
Common Validation Patterns (Email, IP, Pixel Sizes)
Quick validation protects your charts from junk data. Keep patterns simple and fast—don’t try to implement the full RFC spec inside a dashboard.
- Email (simplified): REGEXP_MATCH(Email, "^[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}$") with (?i) for case-insensitive.
- IPv4: REGEXP_MATCH(IP, "^((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)$")
- Pixel size (e.g., 300x250): REGEXP_MATCH(Size, "^[1-9]\\d{1,4}x[1-9]\\d{1,4}$")
- EAN-13 format: REGEXP_MATCH(Barcode, "^\\d{13}$")
Validation is a guardrail, not a cure-all. If bad data keeps showing up, fix it at the source or your data layer.
Copy-Paste Snippets (Grouped By Use Case)
Grab these and go. Each line is designed for pasting into calculated fields or filters in Looker Studio regex workflows.
- Brand/non-brand: REGEXP_CONTAINS(Query, "(?i)brandname|brand name|brnd")
- Promo detection: REGEXP_CONTAINS(utm_campaign, "(?i)sale|promo|deal|black\\s?friday|cyber\\s?monday")
- Paid vs organic: CASE WHEN REGEXP_CONTAINS(LOWER(utm_medium), "(paid|cpc|ppc)") THEN "Paid" WHEN REGEXP_CONTAINS(LOWER(utm_medium), "organic|seo") THEN "Organic" ELSE "Other" END
- Canonical path: REGEXP_REPLACE(Page, "\\?.*$", "")
- Category from /shop/: REGEXP_EXTRACT(Page Path, "^/(?:[a-z]{2}/)?shop/([^/]+)/")
- Locale from path: REGEXP_EXTRACT(Page Path, "^/([a-z]{2})(?:-|/)?")
- Extract UTM source: REGEXP_EXTRACT(Full URL, "(?i)[?&]utm_source=([^]+)")
- Normalize Facebook variants: REGEXP_REPLACE(LOWER(utm_source), "(facebook|fb|meta)", "facebook")
Language-Based Regex
Language-based regex is very important—useful for international websites or international campaigns. If you sell across Belgium and the wider EU, your data will include accents, locale codes, and multi-language naming. Plan for it. The right patterns prevent fractured reporting and unlock cleaner insights.
Matching Accents And Diacritics (Unicode Classes And Examples)
You’ll see accents in names, page titles, and search terms. Match them intentionally. For Latin-based alphabets, include accented ranges or Unicode classes so your patterns don’t miss valuable traffic.
- Simple Latin range: REGEXP_CONTAINS(Term, "[A-Za-zÀ-ÿ]") to recognize letters with diacritics.
- Accent-friendly token: REGEXP_MATCH(Category, "^[A-Za-zÀ-ÿ0-9\\-\\s]+$")
- Name capture: REGEXP_EXTRACT(Author, "([A-Za-zÀ-ÿ'\\-]+)\\s([A-Za-zÀ-ÿ'\\-]+)")
If your dataset includes Cyrillic or Greek, widen patterns using explicit ranges for those scripts, or use more permissive character classes where it’s safe.
Locale Tokens And Multi-Language Campaign Names (en|fr|nl Examples)
International campaign naming often embeds locale tokens like en, fr, nl, de. Surface them so you can segment performance at a glance—without maintaining dozens of manual filters.
- Locale in campaign: REGEXP_EXTRACT(utm_campaign, "(?i)(?:^|[_-])(en|fr|nl|de)(?:[_-]|$)")
- Locale in path: REGEXP_EXTRACT(Page Path, "^/(en|fr|nl|de)(?:/|-)")
- Normalize locales: REGEXP_REPLACE(LOWER(Locale), "(be|be[-_]?(nl|fr))", "be") to collapse BE-NL and BE-FR to a single BE when appropriate.
Make the lowest-friction choice: use consistent separators in names (hyphen or underscore) so regex capture groups stay simple.
Strategy: Normalize Vs Match (Pick Based On Traffic And Scale)
Here’s the rule. If you’re scaling across languages and markets, normalize first, match second. If you’re exploring or testing, match first to learn, normalize later.
- Normalize when: multiple ad accounts, mixed UTM naming, or many locales create messy charts. Use REGEXP_REPLACE once, benefit everywhere.
- Match when: you’re testing or auditing a new feed. Use REGEXP_CONTAINS to map what’s out there.
- Hybrid: REPLACE core fields (source, medium, locale), then EXTRACT specific tokens for granular analysis.
Normalization reduces maintenance. It’s easier to teach your team to follow a naming template than to patch every dashboard filter forever.
Examples: Multi-Language UTM Normalization (Copyable Patterns)
Make international dashboards tidy with a few high-leverage fields. Copy, adapt, and reuse across reports.
- Campaign locale: REGEXP_EXTRACT(utm_campaign, "(?i)(?:^|[_-])(en|fr|nl|de|es|it)(?:[_-]|$)")
- Country token: REGEXP_EXTRACT(utm_campaign, "(?i)(?:^|[_-])(be|nl|fr|de|lu|ch|uk)(?:[_-]|$)")
- Normalize “retargeting”: REGEXP_REPLACE(LOWER(utm_campaign), "(retargeting|remarketing|rtg)", "retargeting")
- Paid social sources: REGEXP_REPLACE(LOWER(utm_source), "(facebook|fb|meta|instagram|ig)", "meta")
- Search brand tokens by language: REGEXP_CONTAINS(LOWER(Query), "(?i)marque|merk|brandname")
If global expansion is on your roadmap, align these patterns with your hreflang and international SEO choices. For a broader framework, explore our take on international SEO strategy.
Debugging, Testing And Performance Tips
Iterative Test Workflow (REGEXP_CONTAINS First)
Move fast, then tighten. Start broad with REGEXP_CONTAINS to discover real data patterns, then graduate to REGEXP_MATCH or REGEXP_EXTRACT once you’ve shaped the target.
- Inventory: Table key fields (source, medium, campaign, path, event).
- Probe: Add Test_CONTAINS fields to find common tokens and separators.
- Define: Lock a small set of naming rules; write final patterns.
- Harden: Replace CONTAINS with MATCH where exact rules matter.
A 20-minute discovery pass often saves hours of cleanup later.
Temporary Debug Fields And Sampling
Use debug fields to surface edge cases and false positives. Keep them in a hidden page or at the bottom of your report for quick checks during releases.
- Show “Raw vs Normalized” columns side by side.
- Add “Pattern Hit” booleans (e.g., CONTAINS and MATCH together) to compare behavior.
- Sample: Apply a date filter to last 7–14 days to test fresh data fast.
- Sort by blanks or error states to triage missing values.
Delete debug fields only after you’ve monitored them through a full campaign cycle. Seasonal naming variations can reveal new edge cases.
Performance Considerations By Connector (API/Sheets Vs BigQuery)
Regex can be compute-heavy. Choose where patterns run based on data size and connector behavior to keep your dashboards responsive.
- Sheets/API connectors: Keep patterns simple; avoid nested CASE with many regex calls. Pre-clean in the sheet when possible.
- BigQuery: Best for scale. Push transformations upstream or use source-side views so Looker Studio renders fewer calculated fields.
- Blended data: Minimize regex in blends. Normalize in each source first, then blend on clean, low-cardinality fields.
- Caching: Fewer, reusable calculated fields beat scattered one-offs across charts.
Short rule: The larger the dataset, the closer to the source your regex should run.
When To Use Regex In Studio Vs. In Your Data Layer
Decision Checklist (Reusability, Complexity, Maintenance)
Decide where to place regex by asking three questions. If the answer skews “yes” often, upstream it. Otherwise, keep it in Looker Studio for agility.
- Reusability: Will multiple reports need the same logic? If yes, move to the data layer or warehouse.
- Complexity: Does it require many capture groups or heavy nesting? If yes, push upstream for performance.
- Maintenance: Is it part of your naming standard? If yes, enforce via templates or automation before data hits the dashboard.
Keep lightweight exploration in Looker Studio. Bake standards and high-impact rules into your pipeline.
Migration Examples: Push To BigQuery When…
Here are clear triggers to move regex transformations upstream. A little effort in BigQuery or your ETL pays off with snappier dashboards.
- You maintain 10+ regex fields across multiple Looker Studio reports.
- You regularly hit timeouts on blended charts with regex filters.
- Your team needs one canonical definition of “Brand,” “Locale,” or “Content Group.”
- You want auditability and version control for naming rules.
If you’re building dashboards at scale, consider automating normalization rules. Our approach to marketing automation keeps naming consistent while freeing analysts to focus on insights.
Common Pitfalls And Limitations In Looker Studio Regex
Escaping And Backslash Rules
Watch your slashes. In calculated fields, pattern strings typically accept single backslashes, but replacement groups use \1, \2, etc. If you pasted patterns from other tools (Sheets, SQL), you may need to remove extra escaping or re-add anchors ^ and $ that got stripped.
- Escape literal dots: \\.
- Escape question marks: \\?
- Be explicit with boundaries: (^|[_-]) and ([_-]|$) for token edges.
If something “should match” but doesn’t, check escaping first, then anchors, then case sensitivity.
Case Sensitivity And Workarounds
Looker Studio regex is case-sensitive by default. Add the inline flag (?i) for case-insensitive patterns, or wrap fields with LOWER( ) and standardize values with REGEXP_REPLACE before matching.
- Case-insensitive contains: REGEXP_CONTAINS(Field, "(?i)sale")
- Normalize then match: REGEXP_MATCH(LOWER(Field), "brand|non[-_]?brand")
- Avoid fragile lists: Prefer broader tokens plus boundaries over dozens of case variants.
Consistency beats complexity. Normalize once, then match simply.
Known Function Limits And Gotchas
Looker Studio uses a safe regex engine that prioritizes performance and security. That means some advanced features available elsewhere aren’t supported here.
- No lookbehind assertions like (?<=...). Design patterns without them.
- No backreferences in patterns for complex recursion. Use capture and replace instead.
- Greedy vs. lazy can still bite: prefer explicit character classes and anchors over “.*”.
- Over-nesting CASE + regex can slow charts. Precompute in your data layer for heavy logic.
When in doubt, keep patterns clear, anchored, and easy to read. You’ll debug faster and avoid edge-case failures.
Regex Reference: Tokens You’ll Use Most (Quick Lookup)
Anchors, Quantifiers, Classes, Groups — One-Line Usage Examples
Keep this quick reference handy while you build. Each line shows the token, purpose, and a one-line Looker Studio regex example.
- ^ and $ (start/end): REGEXP_MATCH(Page Path, "^/[a-z0-9\\-]+$")
- . (any char): REGEXP_CONTAINS(Name, "A.B") matches AxB
- * + ? (quantifiers): "colou?r" matches color or colour
- {m,n} (range): "\\d{2,4}" matches 2 to 4 digits
- [] (class): "[A-Za-z0-9_]" alphanumerics + underscore
- [^] (negation): "[^/]" not a slash
- ( ) (group): "shop/([^/]+)/" captures category
- | (or): "(sale|promo|deal)" any of the three
- \\ (escape): "\\." literal dot
- \\d \\w \\s (shortcuts): "\\d+" digits; "\\s" whitespace
- Boundaries: "(^|[_-])en([_-]|$)" token en at word edges
- Flags: "(?i)" case-insensitive, e.g., "(?i)facebook"
Use anchors and explicit classes to control scope. Avoid “match anything” patterns unless you’re exploring.
Next Steps: Templates, Resources And How 6th Man Can Help
Downloadable Dashboard Field Templates (CTA)
Steal our field library. We’ve packaged the patterns above into ready-to-use calculated fields: UTM normalization, URL cleanup, campaign locale extraction, and brand/non-brand classification. Drop them into your next report and focus on insights, not syntax. For more on building strong reporting foundations, see our guide to SEO dashboards with Looker Studio.
Contact 6th Man For Embedded Analytics Support (Final CTA)
If you want regex expressions in Looker Studio dashboards that just work—clean, fast, and aligned to business outcomes—we can help. 6th Man operates as your embedded team, integrating analytics, automation, and growth strategy. Explore our digital marketing services and how we work, or contact us to get a dashboard you can trust.