1. Introduction
CIV.IQ aggregates legislative data from over 15 federal and state government APIs into a single queryable interface. This report documents the technical architecture, data integration challenges, and performance optimizations required to serve comprehensive civic information at scale.
The core challenge: government APIs are inconsistent in format, update frequency, rate limits, and data quality. Building a reliable platform requires abstracting these differences while maintaining data accuracy and freshness.
2. Data Sources
CIV.IQ integrates the following primary data sources:
| Source | Data Provided | Update Frequency |
|---|---|---|
| Congress.gov API | Members, bills, votes, committees | Daily |
| FEC.gov API | Campaign finance, contributions, expenditures | Quarterly filings |
| Census Bureau | Demographics, boundaries, geocoding | Annual/Decennial |
| OpenStates v3 | State legislators, bills, votes | Session-dependent |
| USAspending.gov | Federal contracts, grants | Daily |
| Federal Register | Executive orders, regulations | Daily |
| GovInfo | Hearings, committee reports | As published |
| Wikidata | Biographical data, state executives | Continuous |
3. Architecture
3.1 API Layer
CIV.IQ exposes 101 API endpoints, organized into logical domains:
- Federal Representatives — 35 endpoints covering member profiles, contact info, committee assignments, voting records, sponsored legislation
- Campaign Finance — 8 endpoints for FEC data including contributors, expenditures, industry breakdowns, geographic distribution
- Legislative Tracking — 15 endpoints for bills, amendments, cosponsors, status timelines
- State Government — 25 endpoints for state legislators, district maps, state bills
- Civic Engagement — 18 endpoints for hearings, comment periods, federal spending by district
3.2 Data Normalization
Each upstream API returns data in its own format. Congress.gov uses XML with nested structures. FEC returns paginated JSON. Census provides GeoJSON for boundaries. OpenStates identifies members with a different ID scheme than Congress.gov, so records cannot be joined on a shared key.
The normalization layer maps all sources to a unified schema:
// Unified Representative Schema
interface Representative {
  id: string; // Internal CIV.IQ ID
  bioguideId: string; // Congress.gov identifier
  fecCandidateId: string; // FEC identifier
  openStatesId?: string; // OpenStates identifier (state legs only)
  name: {
    first: string;
    last: string;
    official: string;
    nickname?: string;
  };
  position: {
    chamber: 'house' | 'senate';
    state: string;
    district?: number;
    party: string;
    startDate: string;
    isVoting: boolean; // Distinguishes territorial delegates
  };
  contact: {
    website: string;
    phone: string;
    office: string;
    socialMedia: SocialLinks;
  };
}
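As an illustration of the mapping step, the sketch below normalizes one upstream member record into a reduced form of the unified schema. The `UpstreamMember` field names, the `normalizeMember` helper, and the `civiq:` ID prefix are assumptions made for this example; they are not the actual Congress.gov response shape or CIV.IQ's internal code.

```typescript
// Hypothetical upstream record; these field names are assumptions,
// not the actual Congress.gov response shape.
interface UpstreamMember {
  bioguideId: string;
  firstName: string;
  lastName: string;
  chamber: string; // e.g. "House of Representatives" or "Senate"
  state: string; // two-letter postal code
  district?: number | null;
  partyName: string;
}

// Reduced form of the unified schema: identity, name, and position only.
interface NormalizedMember {
  id: string;
  bioguideId: string;
  name: { first: string; last: string; official: string };
  position: {
    chamber: 'house' | 'senate';
    state: string;
    district?: number;
    party: string;
    isVoting: boolean;
  };
}

// Jurisdictions represented in the House by non-voting members.
const NON_VOTING_JURISDICTIONS = new Set(['DC', 'PR', 'GU', 'VI', 'AS', 'MP']);

function normalizeMember(raw: UpstreamMember): NormalizedMember {
  const chamber = raw.chamber.toLowerCase().includes('senate') ? 'senate' : 'house';
  const state = raw.state.trim().toUpperCase();
  const first = raw.firstName.trim();
  const last = raw.lastName.trim();

  return {
    id: `civiq:${raw.bioguideId}`, // hypothetical internal ID format
    bioguideId: raw.bioguideId,
    name: { first, last, official: `${first} ${last}` },
    position: {
      chamber,
      state,
      // Senators have no district; only copy a real number for House members.
      ...(chamber === 'house' && typeof raw.district === 'number'
        ? { district: raw.district }
        : {}),
      party: raw.partyName,
      isVoting: !(chamber === 'house' && NON_VOTING_JURISDICTIONS.has(state)),
    },
  };
}
```

Keeping the normalizer a pure function of one upstream record makes it easy to unit-test each source adapter separately from fetching and caching.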
3.3 Geographic Resolution
ZIP code to congressional district mapping is non-trivial. ZIP codes are postal routes, not geographic boundaries — they can cross district lines. CIV.IQ uses the Census Bureau's ZIP Code Tabulation Areas (ZCTAs) crosswalked against congressional district shapefiles.
For the 39,495 searchable ZIP codes:
- 87% map to a single congressional district
- 11% span 2 districts (user sees both representatives)
- 2% span 3+ districts (major metro ZIPs)
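The lookup described above can be sketched as a table from ZIP code to one or more districts. The `crosswalk` rows below are placeholders, not real ZCTA data, and `lookupDistricts` is a hypothetical helper name; the real table would be generated from the Census crosswalk.

```typescript
type DistrictId = string; // e.g. "XX-01": state code plus district number

// Placeholder crosswalk rows for illustration only, not real ZCTA data.
const crosswalk = new Map<string, DistrictId[]>([
  ['00001', ['XX-01']],
  ['00002', ['XX-01', 'XX-02']],
  ['00003', ['YY-03', 'YY-04', 'YY-05']],
]);

interface ZipLookup {
  zip: string;
  districts: DistrictId[];
  isSplit: boolean; // true when the ZIP spans more than one district
}

function lookupDistricts(
  input: string,
  table: Map<string, DistrictId[]> = crosswalk,
): ZipLookup | null {
  const trimmed = input.trim();
  // Accept 5-digit ZIPs and ZIP+4; anything else is rejected.
  if (!/^\d{5}(-\d{4})?$/.test(trimmed)) return null;

  const zip = trimmed.slice(0, 5);
  const districts = table.get(zip);
  if (!districts) return null;

  // Return a copy so callers cannot mutate the shared table.
  return { zip, districts: [...districts], isSplit: districts.length > 1 };
}
```

Returning every matching district, instead of picking one, is what lets the UI show all candidate representatives for a split ZIP.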
4. Caching Strategy
Government APIs enforce strict rate limits per API key (Congress.gov: 5,000 requests/hour; FEC: 1,000 requests/hour). A naive implementation that calls upstream on every page view would exhaust these limits with only a few hundred active users. CIV.IQ implements tiered caching with Next.js ISR (Incremental Static Regeneration) and Redis.
4.1 Revalidation Tiers
| Data Type | Revalidation Period | Rationale |
|---|---|---|
| Member biographical data | 1 week | Rarely changes mid-term |
| Committee assignments | 1 day | Changes at session start |
| Voting records | 1 hour | Updates during session |
| Bill status | 1 hour | Active legislation moves fast |
| Campaign finance | 1 day | FEC filings are periodic |
| News/GDELT | 5 minutes | Breaking coverage |
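One way to keep these tiers in a single place is a typed lookup table. This is a minimal sketch: the `DataType` keys, `REVALIDATE_SECONDS`, and `isStale` are illustrative names, not CIV.IQ's actual configuration.

```typescript
type DataType =
  | 'memberBio'
  | 'committees'
  | 'votes'
  | 'billStatus'
  | 'campaignFinance'
  | 'news';

const MINUTE = 60;
const HOUR = 60 * MINUTE;
const DAY = 24 * HOUR;
const WEEK = 7 * DAY;

// Revalidation periods in seconds, mirroring the table above.
const REVALIDATE_SECONDS: Record<DataType, number> = {
  memberBio: WEEK,
  committees: DAY,
  votes: HOUR,
  billStatus: HOUR,
  campaignFinance: DAY,
  news: 5 * MINUTE,
};

// A cached entry is stale once its age reaches the tier's period.
function isStale(
  type: DataType,
  cachedAtMs: number,
  nowMs: number = Date.now(),
): boolean {
  return nowMs - cachedAtMs >= REVALIDATE_SECONDS[type] * 1000;
}
```

The same number of seconds could plausibly serve as both the Next.js `revalidate` value for a route or fetch and the expiry on the matching Redis key, which keeps the two cache layers from drifting apart.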
4.2 Cache Warming
On deployment, a background job pre-fetches and caches data for all 541 federal members (435 representatives, 100 senators, and 6 non-voting members). This keeps first requests fast and reduces upstream API load during traffic spikes.
// Cache warming on deploy
async function warmCache() {
  const members = await fetchAllMembers();
  for (const member of members) {
    await Promise.all([
      cache.set(`member:${member.bioguideId}:profile`, ...),
      cache.set(`member:${member.bioguideId}:votes`, ...),
      cache.set(`member:${member.bioguideId}:bills`, ...),
      cache.set(`member:${member.bioguideId}:finance`, ...),
    ]);
  }
}
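Because warming itself consumes upstream quota, it helps to estimate the request budget before running the job. The sketch below assumes each cached key costs one upstream request and that warming may use only a share of the hourly limit; both assumptions, and the `planWarmup` helper, are illustrative and not part of CIV.IQ's code.

```typescript
interface WarmupPlan {
  totalRequests: number; // requests needed to warm every key
  usablePerHour: number; // requests per hour reserved for warming
  hoursNeeded: number; // minimum wall-clock duration of the warm-up
  minSpacingMs: number; // minimum delay between consecutive requests
}

function planWarmup(
  memberCount: number,
  requestsPerMember: number,
  hourlyLimit: number,
  budgetShare = 0.5, // assumed: leave half the quota for live traffic
): WarmupPlan {
  if (memberCount < 0 || requestsPerMember < 0) {
    throw new RangeError('counts must be non-negative');
  }
  if (hourlyLimit <= 0 || budgetShare <= 0 || budgetShare > 1) {
    throw new RangeError('hourlyLimit must be positive and budgetShare in (0, 1]');
  }

  const usablePerHour = Math.floor(hourlyLimit * budgetShare);
  if (usablePerHour < 1) {
    throw new RangeError('budget leaves no requests for warming');
  }

  const totalRequests = memberCount * requestsPerMember;
  return {
    totalRequests,
    usablePerHour,
    hoursNeeded: totalRequests / usablePerHour,
    minSpacingMs: Math.ceil((3600 * 1000) / usablePerHour),
  };
}
```

For example, `planWarmup(100, 4, 1000)` reports 400 requests, 0.8 hours, and a 7,200 ms spacing. If the estimate exceeds the deployment window, the job can warm the most-visited members first and let ISR fill in the rest on demand.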
5. Constitutional Accuracy
A key design principle: CIV.IQ must accurately represent constitutional distinctions that other civic tools flatten. The 541 members of Congress are not equivalent:
- 435 House Representatives: voting members per Article I
- 100 Senators: voting members per Article I
- 5 Delegates: non-voting members for DC, Guam, the US Virgin Islands, American Samoa, and the Northern Mariana Islands, holding seats created by statute under Congress's Article IV territorial power (and, for DC, its Article I authority over the District)
- 1 Resident Commissioner: Puerto Rico's non-voting member, elected to a 4-year term
The UI explicitly labels voting status and cites the relevant constitutional provisions. ZIP code lookups in territories return delegates with appropriate context about their limited floor voting rights.
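The labeling rule can be sketched as a small classifier keyed on chamber and jurisdiction. The `classifyMember` helper and its label strings are hypothetical. The jurisdiction codes reflect the current House: delegates for DC, Guam, the US Virgin Islands, American Samoa, and the Northern Mariana Islands, plus Puerto Rico's Resident Commissioner.

```typescript
type MemberKind =
  | 'representative'
  | 'senator'
  | 'delegate'
  | 'residentCommissioner';

interface VotingStatus {
  kind: MemberKind;
  isVoting: boolean;
  label: string; // text shown next to the member's name
}

// Jurisdictions represented by a non-voting delegate.
const DELEGATE_JURISDICTIONS = new Set(['DC', 'GU', 'VI', 'AS', 'MP']);

function classifyMember(chamber: 'house' | 'senate', state: string): VotingStatus {
  const code = state.trim().toUpperCase();

  if (chamber === 'senate') {
    return { kind: 'senator', isVoting: true, label: 'Senator (voting member)' };
  }
  if (code === 'PR') {
    return {
      kind: 'residentCommissioner',
      isVoting: false,
      label: 'Resident Commissioner (non-voting)',
    };
  }
  if (DELEGATE_JURISDICTIONS.has(code)) {
    return { kind: 'delegate', isVoting: false, label: 'Delegate (non-voting)' };
  }
  return {
    kind: 'representative',
    isVoting: true,
    label: 'Representative (voting member)',
  };
}
```

Deriving the label from a single function keeps the ZIP lookup, profile pages, and API responses consistent about who can vote on the floor.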
6. Performance Optimizations
6.1 Image Optimization
Official congressional photos (432 current members with available photos) were batch-converted from JPEG to WebP, reducing total asset size by 83%. Images are served via Next.js Image component with automatic format negotiation.
6.2 Map Tiles
District boundary visualization uses PMTiles format instead of traditional vector tile servers. The complete dataset (7,383 state legislative districts + 435 congressional districts) compresses to 24MB — a 75% reduction from the source shapefiles. Maps render client-side via MapLibre GL.
6.3 Static Data Generation
Slowly-changing reference data is pre-generated at build time:
- Census Gazetteer (ZIP to coordinate mapping)
- Committee metadata and Wikipedia descriptions
- State executive biographical data from Wikidata
- District demographic summaries
7. Error Handling
Government APIs have varying reliability. CIV.IQ's error handling philosophy: real data or explicit absence. Every endpoint either returns verified government data or a clear "Data unavailable" response with source attribution.
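A minimal sketch of the "real data or explicit absence" rule is a result type with no third state for placeholder data. The `SourceResult` shape and the `fromSource` wrapper are assumptions for illustration, not CIV.IQ's actual response format.

```typescript
type SourceResult<T> =
  | { status: 'ok'; data: T; source: string; fetchedAt: string }
  | { status: 'unavailable'; message: 'Data unavailable'; source: string; reason: string };

// Wraps an upstream loader so callers always receive either real data
// or an explicit, attributed absence; errors never turn into fake data.
async function fromSource<T>(
  source: string,
  load: () => Promise<T>,
): Promise<SourceResult<T>> {
  try {
    const data = await load();
    if (data === null || data === undefined) {
      return {
        status: 'unavailable',
        message: 'Data unavailable',
        source,
        reason: 'empty response',
      };
    }
    return { status: 'ok', data, source, fetchedAt: new Date().toISOString() };
  } catch (err) {
    return {
      status: 'unavailable',
      message: 'Data unavailable',
      source,
      reason: err instanceof Error ? err.message : String(err),
    };
  }
}
```

Because the union is discriminated on `status`, the compiler forces every caller to handle the unavailable case before it can read `data`.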
8. Future Work
- Historical data — Voting records and campaign finance for previous congresses
- Local expansion — City council data beyond the current 10 major cities
- Alert system — Notifications for bill status changes, upcoming votes, comment period deadlines
- API access — Public API for researchers and civic developers
9. Conclusion
Building a reliable civic intelligence platform requires treating government data aggregation as a serious engineering problem. The challenges — inconsistent APIs, rate limits, constitutional nuance, geographic complexity — are solvable with careful architecture. The result is a platform that makes democratic participation more informed without sacrificing accuracy for convenience.