Technical Foundation for Programmatic SEO
Programmatic SEO, meaning the generation of hundreds or thousands of pages from template and data combinations, has become a strong traffic channel for B2B, e-commerce, and SaaS sites. But a poorly designed technical foundation can get those pages classified as low quality by Google and put the entire site's reputation at risk. This article covers the URL structure, entity signals, content schema, and tracking layer that programmatic SEO needs.
URL Structure: The First and Most Irreversible Decision
Programmatic SEO URLs are the starting point of all quality signals. A good URL structure:
- Hierarchical meaning: /istanbul/mobile-app-developer (city / service structure)
- Short and clean: 3-5 segments. Query strings like /?city=istanbul&service=mobile are bad
- Canonical consistency: If multiple URLs can reach the same page, declare a single canonical
- Trailing slash decision: One or the other, never both
- Lowercase + hyphen: Avoid case-sensitive traps
A common mistake in programmatic generation is making the same combination reachable from multiple URLs. If both /istanbul/kadikoy and /kadikoy/istanbul respond, search engines see duplicate content. Canonical tags or headers partially fix this, but enforcing a single segment order in the infrastructure up front is better.
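The URL rules above can be enforced in one place in the generator. The following is a minimal sketch, assuming a fixed city / service segment order and a no-trailing-slash policy; the function names (slugify, canonical_path, redirect_target) are illustrative and do not come from any particular framework.

```python
import re
import unicodedata


def slugify(value: str) -> str:
    """Lowercase, strip accents, and join words with hyphens."""
    # The Turkish dotless "ı" has no Unicode decomposition, so map it by hand.
    value = value.replace("ı", "i")
    # Decompose accented characters and drop the combining marks ("ö" -> "o").
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def canonical_path(city: str, service: str) -> str:
    """Build the single canonical path: /<city>/<service>, no trailing slash."""
    return f"/{slugify(city)}/{slugify(service)}"


def redirect_target(requested_path: str, city: str, service: str):
    """Return the canonical path to redirect to, or None if already canonical."""
    canonical = canonical_path(city, service)
    return None if requested_path == canonical else canonical


print(canonical_path("İstanbul", "Mobile App Developer"))
```

Because every route is produced by canonical_path, a swapped segment order or an uppercase variant never becomes a second live URL; such requests can be answered with a permanent redirect to the value returned by redirect_target.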
Entity Signals: Not Spam, Real Information
For Google to accept programmatic pages as "not spam", each page must represent a real entity. Entity signals:
- Unique data: Not just template, but page-specific data (local business count, average price, customer count)
- Relations: Meaningful internal links — other services in the same city, neighboring cities
- Structured data: Entity type clearly defined via schema.org (LocalBusiness, City, Service)
- External signals: Does this entity exist in the outside world? Wikidata, Google Knowledge Graph links
An example: if a page exists for "Istanbul mobile app developer", it must contain information about mobile development in Istanbul that is hard to find elsewhere, such as local startup density, typical pricing bands, and case studies, not just a filled-in template.
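The structured data signal can be sketched as a small JSON-LD builder. This example assumes a schema.org Service whose areaServed is a City; the function names, the example.com page URL, and the example.org entity URL are placeholders, and a real page would point sameAs at the entity's actual Wikidata record.

```python
import json


def service_jsonld(service_name, city_name, page_url, same_as=None):
    """Describe one city/service page as a schema.org Service entity."""
    city = {"@type": "City", "name": city_name}
    if same_as:
        # Link the city to an external record so the entity can be verified.
        city["sameAs"] = same_as
    return {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": f"{service_name} in {city_name}",
        "serviceType": service_name,
        "areaServed": city,
        "url": page_url,
    }


def to_script_tag(data):
    """Serialize the entity for embedding in the page <head>."""
    # Escape "</" so the JSON cannot close the script element early.
    body = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">{body}</script>'


entity = service_jsonld(
    "Mobile App Developer",
    "Istanbul",
    "https://example.com/istanbul/mobile-app-developer",  # placeholder URL
    same_as="https://example.org/entity/istanbul",  # placeholder entity link
)
print(to_script_tag(entity))
```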
Content Schema: Variable vs Constant
Programmatic pages contain two content types:
- Variable: Page-specific — city name, service type, statistics
- Constant: Shared across pages — about the site, generic quality statements
The ratio of constant content in a page matters. Pages with 80% constant + 20% variable signal duplicate content. Target ratio: at least 40-50% variable content.
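One rough way to check this ratio before publishing is to count words per content block. The sketch below assumes the template already separates variable blocks from constant ones; word count is only a proxy for how search engines weigh content, and the function name is illustrative.

```python
def variable_ratio(variable_blocks, constant_blocks):
    """Share of a page's words that come from page-specific content."""
    variable_words = sum(len(block.split()) for block in variable_blocks)
    constant_words = sum(len(block.split()) for block in constant_blocks)
    total = variable_words + constant_words
    # An empty page has no variable content to speak of.
    return variable_words / total if total else 0.0


ratio = variable_ratio(
    ["Istanbul hosts many app studios"],           # page-specific text
    ["We build reliable software", "Contact us"],  # shared template text
)
print(f"variable share: {ratio:.0%}")
```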
Sources for enriching variable content:
- Internal data (user count, transaction volume)
- Third-party APIs (Google Places, weather, statistical agencies, OpenStreetMap)
- AI generation + human editor: AI drafts, editor fact-checks
- User-generated content (reviews, questions)
Tracking Layer: What Are You Measuring?
Programmatic SEO success isn't measured by page count but by traffic and conversion per page. Tracking layer:
- Indexation rate: What percentage of generated pages are indexed by Google? 80%+ is healthy
- Impression/page: Share of pages receiving impressions in Search Console
- Click-through: Impression → click conversion
- Bounce rate: Did the user arrive and leave?
- Conversion: Form submit, lead, sign-up rate
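The metrics above can be rolled up from per-page records. This is a minimal sketch that assumes the data has already been exported from Search Console and the analytics tool into a PageStats record; both the record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    indexed: bool
    impressions: int
    clicks: int
    conversions: int


def summarize(pages):
    """Aggregate per-page stats into the tracking-layer metrics."""
    total = len(pages)
    impressions = sum(p.impressions for p in pages)
    clicks = sum(p.clicks for p in pages)
    conversions = sum(p.conversions for p in pages)
    return {
        # Share of generated pages that are indexed.
        "indexation_rate": sum(p.indexed for p in pages) / total if total else 0.0,
        # Share of pages that received at least one impression.
        "pages_with_impressions": sum(p.impressions > 0 for p in pages) / total if total else 0.0,
        # Impression to click conversion.
        "ctr": clicks / impressions if impressions else 0.0,
        # Click to lead or sign-up conversion.
        "conversion_rate": conversions / clicks if clicks else 0.0,
    }
```

Tracking these as ratios rather than raw counts keeps the focus on performance per page, so generating more pages does not by itself make the numbers look better.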
Critical implementation: IndexNow API + automated sitemap submission. Each new programmatic page is pushed to Bing and Yandex via IndexNow. Google is covered separately through the sitemap registered in Search Console, which is regenerated daily.
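An IndexNow submission is a single JSON POST. The sketch below builds the request without sending it and assumes the key file is hosted at the site root as <key>.txt; the host, key, and page URL are placeholders.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Prepare an IndexNow POST announcing new or changed URLs."""
    payload = {
        "host": host,
        "key": key,
        # Assumption: the key file lives at the site root.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


request = build_indexnow_request(
    "example.com",  # placeholder host
    "abc123",       # placeholder key
    ["https://example.com/istanbul/mobile-app-developer"],
)
# Sending is one more call, e.g. urllib.request.urlopen(request), which is
# left out here so the sketch has no network side effects.
```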
Quality Threshold: When to No-Index?
Not all programmatic pages should be indexed. Low-quality pages can drag the whole site down. Rules for no-indexing:
- Fewer than 300 words of variable content
- Impressions and clicks near zero after 90 days
- No unique information beyond the template
- Entity cannot be verified (fake city, non-existent service)
Run this filtering every 6 months. Google's helpful content system, introduced in 2022 and folded into its core ranking systems in March 2024, penalizes low-value programmatic pages harshly.
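The no-index rules translate into a small decision function. The sketch below takes the 300-word and 90-day thresholds from the list above, while the cutoff for "near zero" (fewer than 10 impressions plus clicks) and the PageQuality record are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

MIN_VARIABLE_WORDS = 300  # from the rules above
REVIEW_AFTER_DAYS = 90    # from the rules above
NEAR_ZERO_TRAFFIC = 10    # assumption: what counts as "near zero"


@dataclass
class PageQuality:
    variable_words: int
    days_live: int
    impressions: int
    clicks: int
    has_unique_data: bool
    entity_verified: bool


def noindex_reasons(page: PageQuality) -> List[str]:
    """List every rule the page violates; an empty list means keep it indexed."""
    reasons = []
    if page.variable_words < MIN_VARIABLE_WORDS:
        reasons.append("thin variable content")
    if (page.days_live >= REVIEW_AFTER_DAYS
            and page.impressions + page.clicks < NEAR_ZERO_TRAFFIC):
        reasons.append("no search demand")
    if not page.has_unique_data:
        reasons.append("template only")
    if not page.entity_verified:
        reasons.append("unverified entity")
    return reasons


def robots_meta(page: PageQuality) -> str:
    """Value for the page's robots meta tag."""
    return "noindex, follow" if noindex_reasons(page) else "index, follow"
```

Returning the reasons instead of a bare boolean makes the periodic review auditable: the dashboard can show why each page was dropped from the index.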
Practical Infrastructure Architecture
Core components of a programmatic SEO system:
- Data source: Combination of CMS, database, spreadsheet, API
- Template engine: Next.js dynamic routes, Astro content collections, or custom render
- Build pipeline: ISR (incremental) or full SSG; scheduled rebuilds
- CDN + cache: Page cache via Cloudflare/CloudFront
- Monitoring: Search Console + GA4 + custom dashboard
- Quality gate: Minimum word count, entity verification, and a uniqueness score check before publish
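The quality gate's uniqueness check can be approximated by comparing word shingles between a candidate page and pages already published. This sketch uses Jaccard similarity over 3-word shingles, which is one common choice rather than a prescribed method; the 0.5 uniqueness threshold and the function names are assumptions.

```python
def shingles(text, size=3):
    """Set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Similarity between two texts: 0.0 is disjoint, 1.0 is identical."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def unique_score(candidate, published):
    """1.0 minus the similarity to the closest already-published page."""
    closest = max((jaccard(candidate, other) for other in published), default=0.0)
    return 1.0 - closest


def passes_quality_gate(candidate, published, min_words=300,
                        min_unique=0.5, entity_verified=True):
    """Gate before publish: verified entity, enough words, unique enough."""
    return (entity_verified
            and len(candidate.split()) >= min_words
            and unique_score(candidate, published) >= min_unique)
```

Comparing every candidate against every published page is quadratic, so a large site would need an approximate method such as MinHash. For a few thousand pages per build, the direct comparison is usually acceptable.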
Tolga Ege - Senior Mobile & Web Developer, Founder of CreativeCode
Mobile App, Web Development, AI, SaaS