Architecture
About the CIA World Factbook Data
The US Central Intelligence Agency publishes an almanac-style reference on the countries of the world known as the CIA World Factbook. There's a wealth of data, and you can learn a lot on the site. I urge you to explore it and drill into the detail. Happily, this data is in the public domain which means we can use it for our own purposes. Note however that you are not permitted to replicate the agency seal; and naturally, you should give proper attribution if you use the data.
The data on the site is not particularly approachable for software purposes, but fortunately a gentleman named Ian Coleman has seen fit to create a JSON edition of the data, which is what we'll be using as our data source. It comes as one big 14MB JSON file, but we'll divide that into a JSON record per country (260 of them).
What We're Building
Today in Part 1 we have two goals:- Get the country data into a DynamoDB table.
- Create Lamba functions for accessing the data.
Country Data in Dynamo DB
Then, we'll create Lamba Functions are accessing the data at various levels:
Data Retrieval via API & Lambda Function
With the above accomplished, it will be smooth sailing to create user interfaces to the data.
Loading the Data
We want our data both in S3 and in DynamoDB. Let's take a look at our source data, a JSON country record:{ "name": "Bermuda", "key": "bermuda", "timestamp": "Monday, February 11, 2019 4:09:28 PM", "source": "Factbook", "introduction": { "background": "Bermuda was first settled in 1609 by shipwrecked English colonists heading for Virginia. Self-governing since 1620, Bermuda is the oldest and most populous of the British overseas territories. Vacationing to the island to escape North American winters first developed in Victorian times. Tourism continues to be important to the island's economy, although international business has overtaken it in recent years. Bermuda has also developed into a highly successful offshore financial center. A referendum on independence from the UK was soundly defeated in 1995." }, "geography": { "location": "North America, group of islands in the North Atlantic Ocean, east of South Carolina (US)", "geographic_coordinates": { "latitude": { "degrees": 32, "minutes": 20, "hemisphere": "N" }, "longitude": { "degrees": 64, "minutes": 45, "hemisphere": "W" } }, "map_references": "North America", "area": { "total": { "value": 54, "units": "sq km" }, "land": { "value": 54, "units": "sq km" }, "water": { "value": 0, "units": "sq km" }, "global_rank": 232, "comparative": "about one-third the size of Washington, DC" }, "land_boundaries": { "total": { "value": 0, "units": "km" } }, "coastline": { "value": 103, "units": "km" }, "maritime_claims": { "territorial_sea": { "value": 12, "units": "nm" }, "exclusive_fishing_zone": { "value": 200, "units": "nm" } }, "climate": "subtropical; mild, humid; gales, strong winds common in winter", "terrain": "low hills separated by fertile depressions", "elevation": { "lowest_point": "Atlantic Ocean", "79_highest_point": "Town Hill" }, "natural_resources": { "resources": [ "limestone", "pleasant climate fostering tourism" ] }, "land_use": { "by_sector": { "agricultural_land_total": { "value": 14.8, "units": "%" }, "arable_land": { "value": 14.8, "units": "%", "note": "/" }, "permanent_crops": { "value": 0, "units": "%", "note": "/" }, "permanent_pasture": { "value": 0, "units": "%" }, "forest": { "value": 20, "units": "%" }, "other": { "value": 65.2, "units": "%" } }, "date": "2011" }, "population_distribution": "relatively even population distribution throughout", "natural_hazards": [ { "description": "hurricanes (June to November)", "type": "hazard" } ], "environment": { "current_issues": [ "dense population and heavy vehicle traffic create serious congestion and air pollution problems", "water resources scarce (most obtained as rainwater or from wells)", "solid waste disposal", "hazardous waste disposal", "sewage disposal", "overfishing", "oil spills" ] } }, "people": { "population": { "total": 71176, "global_rank": 203, "date": "2018-07-01" }, "nationality": { "noun": "Bermudian(s)", "adjective": "Bermudian" }, "ethnic_groups": { "ethnicity": [ { "name": "black", "percent": 53.8 }, { "name": "white", "percent": 31 }, { "name": "mixed", "percent": 7.5 }, { "name": "other", "percent": 7.1 }, { "name": "unspecified", "percent": 0.6 } ], "date": "2010" }, "languages": { "language": [ { "name": "English", "note": "official" }, { "name": "Portuguese" } ] }, "religions": { "religion": [ { "name": "Protestant", "percent": 46.2, "breakdown": [ { "name": "includes Anglican", "percent": 15.8 }, { "name": "African Methodist Episcopal", "percent": 8.6 }, { "name": "Seventh Day Adventist", "percent": 6.7 }, { "name": "Pentecostal", "percent": 3.5 }, { "name": "Methodist", "percent": 2.7 }, { "name": "Presbyterian", "percent": 2 }, { "name": "Church of God", "percent": 1.6 }, { "name": "Baptist", "percent": 1.2 }, { "name": "Salvation Army", "percent": 1.1 }, { "name": "Brethren", "percent": 1 }, { "name": "other Protestant", "percent": 2 } ] }, { "name": "Roman Catholic", "percent": 14.5 }, { "name": "Jehovah's Witness", "percent": 1.3 }, { "name": "other Christian", "percent": 9.1 }, { "name": "Muslim", "percent": 1 }, { "name": "other", "percent": 3.9 }, { "name": "none", "percent": 17.8 }, { "name": "unspecified", "percent": 6.2 } ], "date": "2010" }, "age_structure": { "0_to_14": { "percent": 16.92, "males": 6088, "females": 5957 }, "15_to_24": { "percent": 11.95, "males": 4306, "females": 4197 }, "25_to_54": { "percent": 36.56, "males": 13049, "females": 12972 }, "55_to_64": { "percent": 16.04, "males": 5383, "females": 6034 }, "65_and_over": { "percent": 18.53, "males": 5596, "females": 7594 }, "date": "2018" }, "median_age": { "total": { "value": 43.5, "units": "years" }, "male": { "value": 41.5, "units": "years" }, "female": { "value": 45.4, "units": "years" }, "global_rank": 18, "date": "2018" }, "population_growth_rate": { "growth_rate": 0.43, "global_rank": 158, "date": "2018" }, "birth_rate": { "births_per_1000_population": 11.3, "global_rank": 172, "date": "2018" }, "death_rate": { "deaths_per_1000_population": 8.7, "global_rank": 71, "date": "2018" }, "net_migration_rate": { "migrants_per_1000_population": 1.8, "global_rank": 50, "date": "2017" }, "population_distribution": "relatively even population distribution throughout", "urbanization": { "urban_population": { "value": 100, "units": "%", "date": "2018" }, "rate_of_urbanization": { "value": -0.44, "units": "%" } }, "major_urban_areas": { "places": [ { "place": "Hamilton", "population": 10000, "is_capital": true } ], "date": "2018" }, "sex_ratio": { "by_age": { "at_birth": { "value": 1.02, "units": "males/female" }, "0_to_14_years": { "value": 1.02, "units": "males/female" }, "15_to_24_years": { "value": 1.01, "units": "males/female" }, "25_to_54_years": { "value": 1, "units": "males/female" }, "55_to_64_years": { "value": 0.89, "units": "males/female" }, "65_years_and_over": { "value": 0.73, "units": "males/female" } }, "total_population": { "value": 0.94, "units": "males/female" }, "date": "2017" }, "infant_mortality_rate": { "total": { "value": 2.5, "units": "deaths_per_1000_live_births" }, "male": { "value": 2.6, "units": "deaths_per_1000_live_births" }, "female": { "value": 2.4, "units": "deaths_per_1000_live_births" }, "global_rank": 217, "date": "2018" }, "life_expectancy_at_birth": { "total_population": { "value": 81.5, "units": "years" }, "male": { "value": 78.3, "units": "years" }, "female": { "value": 84.7, "units": "years" }, "global_rank": 26, "date": "2018" }, "total_fertility_rate": { "children_born_per_woman": 1.92, "global_rank": 128, "date": "2018" }, "education_expenditures": { "percent_of_gdp": 1.5, "global_rank": 175, "date": "2017" }, "school_life_expectancy": { "total": { "value": 12, "units": "years" }, "male": { "value": 11, "units": "years" }, "female": { "value": 12, "units": "years" }, "date": "2015" }, "youth_unemployment": { "total": { "value": 29.3, "units": "%" }, "male": { "value": 29.7, "units": "%" }, "female": { "value": 29, "units": "%" }, "global_rank": 35, "date": "2014" } }, "government": { "country_name": { "conventional_long_form": "none", "conventional_short_form": "Bermuda", "former": "Somers Islands", "etymology": "the islands making up Bermuda are named after Juan de BERMUDEZ, an early 16th century Spanish sea captain and the first European explorer of the archipelago" }, "government_type": "parliamentary democracy (Parliament); self-governing overseas territory of the UK", "capital": { "name": "Hamilton", "geographic_coordinates": { "latitude": { "degrees": 32, "minutes": 17, "hemisphere": "N" }, "longitude": { "degrees": 64, "minutes": 47, "hemisphere": "W" } }, "time_difference": { "timezone": -4, "note": "1 hour ahead of Washington, DC, during Standard Time" }, "daylight_saving_time": "+1hr, begins second Sunday in March; ends first Sunday in November" }, "administrative_divisions": [ { "name": "Devonshire", "type": "" }, { "name": "Hamilton", "type": "" }, { "name": "Hamilton", "type": "" }, { "name": "Paget", "type": "" }, { "name": "Pembroke", "type": "" }, { "name": "Saint George", "type": "" }, { "name": "Saint George's", "type": "" }, { "name": "Sandys", "type": "" }, { "name": "Smith's", "type": "" }, { "name": "Southampton", "type": "" }, { "name": "Warwick", "type": "" } ], "independence": { "note": "overseas territory of the UK" }, "national_holidays": [ { "name": "Bermuda Day", "day": "24 May", "note": "formerly known as Victoria Day, Empire Day, and Commonwealth Day" } ], "constitution": { "history": "several previous (dating to 1684); latest entered into force 8 June 1968 (Bermuda Constitution Order 1968) (2018)", "amendments": "proposal procedure - NA; passage by an Order in Council in the UK; amended several times, last in 2012 (2018)" }, "legal_system": "English common law", "international_law_organization_participation": [ "has not submitted an ICJ jurisdiction declaration", "non-party state to the ICCt" ], "citizenship": { "citizenship_by_birth": "no", "citizenship_by_descent_only": "at least one parent must be a citizen of the UK", "dual_citizenship_recognized": "yes", "residency_requirement_for_naturalization": "10 years" }, "suffrage": { "age": 18, "universal": true, "compulsory": false }, "executive_branch": { "chief_of_state": "Queen ELIZABETH II (since 6 February 1952); represented by Governor John RANKIN (since 5 December 2016)", "head_of_government": "Premier David BURT (since 19 July 2017)", "cabinet": "Cabinet nominated by the premier, appointed by the governor", "elections_appointments": "the monarchy is hereditary; governor appointed by the monarch; following legislative elections, the leader of the majority party or majority coalition usually appointed premier by the governor" }, "legislative_branch": { "description": "bicameral Parliament consists of: Senate (11 seats; 3 members appointed by the governor, 5 by the premier, and 3 by the opposition party; members serve 5-year terms) and the House of Assembly (36 seats; members directly elected in single-seat constituencies by simple majority vote to serve up to 5-year terms)\nHouse of Assembly (36 seats; members directly elected in single-seat constituencies by simple majority vote to serve up to 5-year terms)", "elections": "Senate - last appointments in August 2017 (next appointments in 2022)\nHouse of Assembly - last held on 18 July 2017 (next to be held not later than 2022)", "election_results": "Senate - composition - men 7, women 4, percent of women 36.4%\nHouse of Assembly - percent of vote by party - PLP 58.9%, OBA 40.6%, other 0.5%; seats by party - PLP 24, OBA 12; composition - men 28, women 8, percent of women 22.2%; note - total Parliament percent of women 25.5%" }, "judicial_branch": { "highest_courts": "Court of Appeal (consists of the court president and at least 2 justices); Supreme Court (consists of the chief justice, 4 puisne judges, and 1 associate justice); note - the Judicial Committee of the Privy Council in London is the court of final appeal", "judge_selection_and_term_of_office": "Court of Appeal justice appointed by the governor; justice tenure by individual appointment; Supreme Court judges nominated by the Judicial and Legal Services Commission and appointed by the governor; judge tenure based on terms of appointment", "subordinate_courts": "commercial court (began in 2006); magistrates' courts" }, "political_parties_and_leaders": { "parties": [ { "name": "One Bermuda Alliance", "name_alternative": "OBA", "note": "vacant" }, { "name": "Progressive Labor Party", "name_alternative": "PLP", "leaders": [ "E. David BURT" ] } ] }, "international_organization_participation": [ { "organization": "Caricom ", "note": "associate" }, { "organization": "ICC ", "note": "NGOs" }, { "organization": "Interpol ", "note": "subbureau" }, { "organization": "IOC" }, { "organization": "ITUC ", "note": "NGOs" }, { "organization": "UPU" }, { "organization": "WCO" } ], "diplomatic_representation": { "from_united_states": { "chief_of_mission": "Consul General Mary Ellen KOENIG (since 28 November 2015)", "mailing_address": "P. O. Box HM325, Hamilton HMBX; American Consulate General Hamilton, US Department of State, 5300 Hamilton Place, Washington, DC 20520-5300", "telephone": "[1] (441) 295-1342", "fax": "[1] (441) 295-1592, 296-9233", "consulates_general": "Crown Hill, 16 Middle Road, Devonshire DVO3" } }, "flag_description": { "description": "red, with the flag of the UK in the upper hoist-side quadrant and the Bermudian coat of arms (a white shield with a red lion standing on a green grassy field holding a scrolled shield showing the sinking of the ship Sea Venture off Bermuda in 1609) centered on the outer half of the flag; it was the shipwreck of the vessel, filled with English colonists originally bound for Virginia, that led to the settling of Bermuda", "note": "the flag is unusual in that it is only British overseas territory that uses a red ensign, all others use blue" }, "national_symbol": { "symbols": [ { "symbol": "red lion" } ] }, "national_anthem": { "name": "Hail to Bermuda", "lyrics_music": "Bette JOHNS", "note": "serves as a local anthem; as a territory of the United Kingdom, \"God Save the Queen\" is official (see United Kingdom)" } }, "economy": { "overview": "International business, which consists primarily of insurance and other financial services, is the real bedrock of Bermuda's economy, consistently accounting for about 85% of the island's GDP. Tourism is the country’s second largest industry, accounting for about 5% of Bermuda's GDP but a much larger share of employment. Over 80% of visitors come from the US and the sector struggled in the wake of the global recession of 2008-09. Even the financial sector has lost roughly 5,000 high-paying expatriate jobs since 2008, weighing heavily on household consumption and retail sales. Bermuda must import almost everything. Agriculture and industry are limited due to the small size of the island.\nBermuda's economy returned to negative growth in 2016, reporting a contraction of 0.1% GDP, after growing by 0.6% in 2015. Unemployment reached 7% in 2016 and 2017, public debt is growing and exceeds $2.4 billion, and the government continues to work on attracting foreign investment. Still, Bermuda enjoys one of the highest per capita incomes in the world.", "gdp": { "purchasing_power_parity": { "annual_values": [ { "value": 6127000000, "units": "USD", "date": "2016" }, { "value": 6133000000, "units": "USD", "date": "2015" }, { "value": 6097000000, "units": "USD", "date": "2014" } ], "global_rank": 172 }, "official_exchange_rate": { "USD": 6127000000, "date": "2016" }, "real_growth_rate": { "annual_values": [ { "value": -0.1, "units": "%", "date": "2016" }, { "value": 0.6, "units": "%", "date": "2015" }, { "value": -0.3, "units": "%", "date": "2014" } ], "global_rank": 198 }, "per_capita_purchasing_power_parity": { "annual_values": [ { "value": 99400, "units": "USD", "date": "2016" }, { "value": 95500, "units": "USD", "date": "2015" }, { "value": 87500, "units": "USD", "date": "2014" } ], "global_rank": 6 }, "composition": { "by_end_use": { "end_uses": { "household_consumption": { "value": 51.3, "units": "%" }, "government_consumption": { "value": 15.7, "units": "%" }, "investment_in_fixed_capital": { "value": 13.7, "units": "%" }, "investment_in_inventories": { "value": 0, "units": "%" }, "exports_of_goods_and_services": { "value": 49.8, "units": "%" }, "imports_of_goods_and_services": { "value": -30.4, "units": "%" } }, "date": "2017" }, "by_sector_of_origin": { "sectors": { "agriculture": { "value": 0.9, "units": "%" }, "industry": { "value": 5.3, "units": "%" }, "services": { "value": 93.8, "units": "%" } }, "date": "2017" } } }, "agriculture_products": { "products": [ "bananas", "vegetables", "citrus", "flowers", "dairy products", "honey" ] }, "industries": { "industries": [ "international business", "tourism", "light manufacturing" ] }, "industrial_production_growth_rate": { "annual_percentage_increase": 2, "global_rank": 129, "date": "2017" }, "labor_force": { "total_size": { "total_people": 33480, "global_rank": 202, "date": "2016" }, "by_occupation": { "occupation": { "agriculture": { "value": 2, "units": "%" }, "industry": { "value": 13, "units": "%" }, "services": { "value": 85, "units": "%" } }, "date": "2016" } }, "unemployment_rate": { "annual_values": [ { "value": 7, "units": "%", "date": "2017" }, { "value": 7, "units": "%", "date": "2016" } ], "global_rank": 106 }, "population_below_poverty_line": { "value": 11, "units": "%", "date": "2008" }, "household_income_by_percentage_share": {}, "budget": { "revenues": { "value": 999200000, "units": "USD" }, "expenditures": { "value": 1176000000, "units": "USD" }, "date": "2017" }, "taxes_and_other_revenues": { "percent_of_gdp": 16.3, "global_rank": 183, "date": "2017" }, "budget_surplus_or_deficit": { "percent_of_gdp": -2.9, "global_rank": 127, "date": "2017" }, "public_debt": { "annual_values": [ { "value": 43, "units": "percent_of_gdp" } ], "global_rank": 117 }, "fiscal_year": { "start": "1 April", "end": "31 March" }, "inflation_rate": { "annual_values": [ { "value": 1.9, "units": "%", "date": "2017" }, { "value": 1.4, "units": "%", "date": "2016" } ], "global_rank": 96 }, "stock_of_narrow_money": { "annual_values": [ { "value": 3374000000, "units": "USD", "date": "2014-09-30" }, { "value": 3422000000, "units": "USD", "date": "2013-12-31" } ], "global_rank": 118, "note": "figures do not include US dollars, which also circulate freely" }, "stock_of_broad_money": { "annual_values": [ { "value": 22100000000, "units": "USD", "date": "2014-09-30" }, { "value": 25100000000, "units": "USD", "date": "2013-12-31" } ], "global_rank": 67 }, "stock_of_domestic_credit": { "note": "NA" }, "market_value_of_publicly_traded_shares": { "annual_values": [ { "value": 1850000000, "units": "USD", "date": "2015-12-31" }, { "value": 1601000000, "units": "USD", "date": "2014-12-31" }, { "value": 1467000000, "units": "USD", "date": "2013-12-31" } ], "global_rank": 100 }, "current_account_balance": { "annual_values": [ { "value": 818600000, "units": "USD", "date": "2017" }, { "value": 763000000, "units": "USD", "date": "2016" } ], "global_rank": 53 }, "exports": { "total_value": { "annual_values": [ { "value": 19000000, "units": "USD", "date": "2017" }, { "value": 19000000, "units": "USD", "date": "2016" } ], "global_rank": 210 }, "commodities": { "by_commodity": [ "reexports of pharmaceuticals" ] }, "partners": { "by_country": [ { "name": "Jamaica", "percent": 49.1 }, { "name": "Luxembourg", "percent": 36.1 }, { "name": "US", "percent": 4.9 } ], "date": "2017" } }, "imports": { "total_value": { "annual_values": [ { "value": 1094000000, "units": "USD", "date": "2017" }, { "value": 980000000, "units": "USD", "date": "2016" } ], "global_rank": 183 }, "commodities": { "by_commodity": [ "clothing", "fuels", "machinery", "transport equipment", "construction materials", "chemicals", "food", "live animals" ] }, "partners": { "by_country": [ { "name": "US", "percent": 72.1 }, { "name": "South Korea", "percent": 9.7 }, { "name": "Canada", "percent": 4.2 } ], "date": "2017" } }, "external_debt": { "annual_values": [ { "value": 2515000000, "units": "USD", "date": "2017" }, { "value": 2435000000, "units": "USD", "date": "2015" } ], "global_rank": 150 }, "stock_of_direct_foreign_investment": { "at_home": { "annual_values": [ { "value": 2641000000, "units": "USD", "date": "2014" }, { "value": 2664000000, "units": "USD", "date": "2013" } ], "global_rank": 116 }, "abroad": { "annual_values": [ { "value": 889000000, "units": "USD", "date": "2014" }, { "value": 835000000, "units": "USD", "date": "2013" } ], "global_rank": 90 } }, "exchange_rates": { "annual_values": [ { "value": 1, "units": "USD", "date": "2017" }, { "value": 1, "units": "USD", "date": "2016" }, { "value": 1, "units": "USD", "date": "2015" }, { "value": 1, "units": "USD", "date": "2014" }, { "value": 1, "units": "USD", "date": "2013" } ], "note": "Bermudian dollars (BMD) per US dollar" } }, "energy": { "electricity": { "access": { "total_electrification": { "value": 100, "units": "%" }, "date": "2016" }, "production": { "kWh": 650000000, "global_rank": 159, "date": "2016" }, "consumption": { "kWh": 604500000, "global_rank": 166, "date": "2016" }, "exports": { "kWh": 0, "global_rank": 107, "date": "2016" }, "imports": { "kWh": 0, "global_rank": 126, "date": "2016" }, "installed_generating_capacity": { "kW": 171000, "global_rank": 169, "date": "2016" }, "by_source": { "fossil_fuels": { "percent": 100, "global_rank": 3, "date": "2016" }, "nuclear_fuels": { "percent": 0, "global_rank": 50, "date": "2017" }, "hydroelectric_plants": { "percent": 0, "global_rank": 158, "date": "2017" }, "other_renewable_sources": { "percent": 0, "global_rank": 176, "date": "2017" } } }, "crude_oil": { "production": { "bbl_per_day": 0, "global_rank": 110, "date": "2017" }, "exports": { "bbl_per_day": 0, "global_rank": 94, "date": "2015" }, "imports": { "bbl_per_day": 0, "global_rank": 97, "date": "2015" }, "proved_reserves": { "bbl": 0, "global_rank": 107, "date": "2018-01-01" } }, "refined_petroleum_products": { "production": { "bbl_per_day": 0, "global_rank": 119, "date": "2017" }, "consumption": { "bbl_per_day": 5000, "global_rank": 178, "date": "2016" }, "exports": { "bbl_per_day": 0, "global_rank": 131, "date": "2015" }, "imports": { "bbl_per_day": 3939, "global_rank": 178, "date": "2015" } }, "natural_gas": { "production": { "cubic_metres": 0, "global_rank": 105, "date": "2017" }, "consumption": { "cubic_metres": 0, "global_rank": 122, "date": "2017" }, "exports": { "cubic_metres": 0, "global_rank": 70, "date": "2017" }, "imports": { "cubic_metres": 0, "global_rank": 92, "date": "2017" }, "proved_reserves": { "cubic_metres": 0, "global_rank": 111, "date": "2014-01-01" } }, "carbon_dioxide_emissions_from_consumption_of_energy": { "megatonnes": 793700, "global_rank": 174, "date": "2017" } }, "communications": { "telephones": { "fixed_lines": { "total_subscriptions": 21883, "subscriptions_per_one_hundred_inhabitants": 31, "global_rank": 173, "date": "2017" }, "mobile_cellular": { "total_subscriptions": 64997, "subscriptions_per_one_hundred_inhabitants": 92, "global_rank": 198, "date": "2017" }, "system": { "general_assessment": "a good, fully automatic digital telephone system with fiber-optic trunk lines; telecom sector provides a relatively high contribution to overall GDP; numerous competitors licensed, but small and localized (2017)", "domestic": "the system has a high fixed-line teledensity 31 per 100, coupled with a mobile-cellular teledensity of roughly 92 per 100 persons (2017)", "international": "country code - 1-441; landing points for the GlobeNet, Gemini Bermuda, CBUS, and the Challenger Bermuda-1 (CB-1) submarine cables; satellite earth stations - 3 (2015)" } }, "broadcast_media": "3 TV stations; cable and satellite TV subscription services are available; roughly 13 radio stations operating (2012)", "internet": { "country_code": ".bm", "users": { "total": 69126, "percent_of_population": 98, "global_rank": 181, "date": "2016-07-01" } } }, "transportation": { "air_transport": { "civil_aircraft_registration_country_code_prefix": { "prefix": "VP-B", "date": "2016" }, "airports": { "total": { "airports": 1, "global_rank": 214, "date": "2013" }, "paved": { "total": 1, "2438_to_3047_metres": 1, "date": "2017" } } }, "roadways": { "total": { "value": 447, "units": "km" }, "paved": { "value": 447, "units": "km" }, "note": "225 km public roads; 222 km private roads", "global_rank": 138, "date": "2010" }, "merchant_marine": { "total": 160, "by_type": [ { "type": "bulk carrier", "count": 10 }, { "type": "container ship", "count": 8 }, { "type": "general cargo", "count": 1 }, { "type": "oil tanker", "count": 18 }, { "type": "other", "count": 123 } ], "global_rank": 72, "date": "2017" }, "ports_and_terminals": { "major_seaports": [ "Hamilton", "Ireland Island", "Saint George" ] } }, "military_and_security": { "branches": { "by_name": [ "Bermuda Regiment" ], "date": "2012" }, "service_age_and_obligation": { "years_of_age": 18, "note": "18-45 years of age for voluntary male or female enlistment in the Bermuda Regiment; males must register at age 18 and may be subject to conscription; term of service is 38 months for volunteers or conscripts", "date": "2012" }, "note": "defense is the responsibility of the UK" }, "transnational_issues": { "disputes": [ "none" ] } }
JSON Country Record
We can see the data is very detailed, and also a strong fit for DynamoDB: our "records" are JSON documents with many levels of data. We've also added 3 fields of our own: key, timestamp, and source. Key is a derivative of the country name suitable for using as a filename or general key; it's the name converted to lower case, with some characters removed (commas, parentheses), and some characters replaced with underscore (spaces, hyphens). Thus the key for "United States" is "united_states". Timestamp is when the data was last collected. Source is just "Factbook"; we add it because DynamoDB expects a field of the document to map to a partition key.
Loading Files into S3
S3 will hold, for each country, the country JSON record as well as image files for flag and map. We don't really need the country JSON in S3 for this project (since we're going to query DynamoDB for country data), but we're going to be importing the JSON from S3 as a staging location when we insert the data into DynamoDB.I've already retrieved the JSON data and split it into 260 separate country JSON records previously, as well as the flag and map image files. All were originally stored in Azure blob storage. You can get a blow-by-blow account of that here.
To copy over the country JSON and image files, I first downloaded the Azure blobs using my Azure Storage Explorer tool; and then uploaded them to S3 by dragging them into the AWS S3 console. Here's what our end-result in S3 looks like:
Country files in S3
armenia-map.gif
We now have a JSON document for each country, as well as a flag and map image for each country:
armenia.gif
Loading DynamoDB
Next, we want to get our country data into DynamoDB, one country document per country. To do that, we create a DynamoDB table in the AWS Console named factbook. DynamoDB requires us to think about partition key and sort key, which collectively form our unique key to a record. Although our country document records are very deep, the actual number of records is small: 260. Accordingly, we will use the same partition key ("Factbook") for all of our records. The source field we added to the JSON contains this value, so our partition key field is source. For sort key, we'll use country name, captured in the name field.
Creating DynamoDB Table
In my original project, I wrote a durable function which ran on a timer once a week, processing 260 country records in parallel. We may do the same for AWS at some point, but today we'll be more modest: we'll develop a Lambda Function to create a country record in DynamoDB. The function will be called via HTTP with a key parameter, which will be a country key such as "afghanistan" or "united_kingdom". The function will read the country's
Lambda Function to Load DynamoDB Country Record
Our load-country function, written in Node.js, first retrieves the JSON file from our S3 bucket (lines 28-44); the function has a role assigned whose policy grants access to our factbook-data S3 bucket as well our Factbook dynamoDB table. We next replace empty strings with nulls because DynamoDB does not allow empty strings. Next we parse it into an actual JSON variable so we can work with it (line 65). The code adds three housekeeping properties to the original json: key (country key), timestamp, and source ("Factbook") at lines 78-80.
// load-country : load a country record // // This function retrieves a JSON country record for the specified key from S3, // and inserts a document into the factbook DynamoDB table. // inputs: // key parameter: country key, such as "united_states" // https://s3.amazonaws.com/factbook-data/*.json must exist const http = require('http'); exports.handler = function(event, context, callback) { const AWS = require('aws-sdk'); AWS.config.update({region: 'us-east-1'}); const docClient = new AWS.DynamoDB.DocumentClient({region: 'us-east-1'}); // Get country key from HTTP query parameter. var key = event["queryStringParameters"]['key']; //var key = "antigua_and_barbuda"; // <= for in-portal testing // Retrieve.json from s3 var url = 'http://s3.amazonaws.com/factbook-data/' + key + '.json'; console.log("01 http.get " + url); return http.get(url, function(response) { // Continuously update stream with data var body = ''; response.on('data', function(d) { body += d; }); response.on('end', function() { console.log('02 on end'); // Data reception is done, do whatever with it! // replace empty strings ("") with null because DynamoDB disallowes empty strings body = replace(body, 'type: ""', 'type: null'); body = replace(body, '"type": ""', '"type": null'); body = replace(body, 'name_alternative: ""', 'name_alternative": null'); body = replace(body, '"name_alternative": ""', '"name_alternative": null'); body = replace(body, 'note: ""', 'note: null'); body = replace(body, '"note": ""', '"note": null'); body = replace(body, 'foreign_based: ""', 'foreign_based: null'); body = replace(body, '"foreign_based": ""', '"foreign_based": null'); // parse body text into a JSON object var data = null; try { data = JSON.parse(body); console.log('03 parsed'); console.log(data.name); } catch(e) { console.log('03-A exception in JSON.parse: ' + e.toString()); console.log(body); } if (data != null) { // add 3 fields to the document data.key = key; // countryKey(data.name); data.timestamp = 'Monday, February 11, 2019 4:09:28 PM'; data.source = 'Factbook'; // insert country record var params = { TableName: 'factbook', Item: data }; console.log("Adding new item..."); docClient.put(params, function(err, data2) { if (err) { console.error("04 error inserting document. Error JSON:", JSON.stringify(err, null, 2)); context.done(err, { 'statusCode': 200, 'headers': { 'Content-Type': 'application/json' }, 'body': 'Failed to add record' }); } else { console.log("05 document inserted - source | name: " + data.source + ' | ' + data.name); context.done(null, { 'statusCode': 200, 'headers': { 'Content-Type': 'application/json' }, 'body': 'Added record ' + data.name //JSON.stringify(data) }); } }); } else { context.done(null, { 'statusCode': 200, 'headers': { 'Content-Type': 'application/json' }, 'body': 'Failed to add record due to JSON parse error ' //JSON.stringify(data) }); } }); }).on('error', function(err) { // handle errors with the request itself console.error('04 Error with the request:', err.message); callback(err); }); }; // ---- countryKey : generate a country key from a country name function countryKey(countryName) { var countryKey = countryName.toLowerCase(); countryKey = replace(countryKey, ' ', '_'); countryKey = replace(countryKey, '-', '_'); countryKey = replace(countryKey, '(', ''); countryKey = replace(countryKey, ')', ''); countryKey = replace(countryKey, ',', ''); countryKey = replace(countryKey, "'", ''); return countryKey; } function replace(value, oldChar, newChar) { if (!value) return null; return value.split(oldChar).join(newChar); }
load-country Lambda Function
When we test our function, it says all is well.
Invoking load-country
..and, we can verify that by viewing the new record added to DynamoDB in the AWS console:
Viewing added county document in DynamoDB
Lambda Functions to Access Country Data
Now that we have the World Factbook data in a DynamoDB table, we can write Lambda functions to query it.country
The first function we want to write is named country, and its purpose is simply to return an entire country document given a country name. We're writing in Node.js and developing right in the AWS console. Our function is triggered via API Gateway, so that it can instantiated with an HTTP request. We bump the memory to 512MB (the default size is too small for working with DynamoDB).
country function in AWS console
Let's review the code below to understand how it works. We declare a DocumentClient (lines 3-5), which is how we'll access DynamoDB. In line 14, we extract the expected country name in a URL query parameter called name; if for example you want the country record for Japan, you'll add ?name=Japan to the end of the URL. To retrieve the country record, we know that our partition is always "Factbook" and our sort key is the country name. To query the data, we issue a docClient.query (lines 33-46). If successful, the data is returned in the response.
exports.handler = function(event, context, callback) { const AWS = require('aws-sdk'); AWS.config.update({region: 'us-east-1'}); const docClient = new AWS.DynamoDB.DocumentClient({region: 'us-east-1'}); var corsHeaders = { "Access-Control-Allow-Origin" : "*", "Access-Control-Allow-Credentials" : true }; var countryName = null; if (event && event.queryStringParameters && event.queryStringParameters.name) countryName = event.queryStringParameters.name; if (!countryName) { callback(null, { statusCode: 200, headers: corsHeaders, body: 'Missing parameter: name' }); } var params = { TableName: 'factbook', ExpressionAttributeNames: { '#name': 'name', '#source': 'source' }, ExpressionAttributeValues: { ':name': countryName, ':source': 'Factbook' }, KeyConditionExpression: '#name = :name and #source = :source', }; docClient.query(params, function(err, data) { if (err) { console.log('03 err:') console.log(err.toString()); callback(err, { statusCode: 500, headers: corsHeaders, body: 'Error: ${err}' }); } else { if (!data || data.Items.length===0) { callback(null, { statusCode: 400, headers: corsHeaders, body: 'Country not found: ' + countryName }); } else { callback(null, { headers: corsHeaders, body: JSON.stringify(data.Items[0]) }); } } }); };
country function source code (Node.js)
name = :name and source = :source
...except that name and source are both DynamoDB reserved words. To get around that, we use #name and #source, and define those in the ExpressionAttributeNames parameter (lines 22-25). Our query then ends up being this:
KeyConditionExpression: '#name = :name and #source = :source'
If you haven't worked with DynamoDB before, the :name and :source may be unfamiliar. These are parameters that get replaced by values in the ExpressionAttributeValues parameter (lines 26-29).
If the query is successful, we return the entire result. Here's what it's like to invoke country from a browser (note: I have the JSONView Chrome Extension installed which nicely formats the JSON):
Invoking country function from a browser
people
The country function is great, but it's a big blast of data. Perhaps we're interested in a smaller part of the whole. The country JSON has subsections named introduction, geography, people, government, economy, and so on. Let's create a people function to return just the people section.The only area of people that's different from country is the query parameters: we've added a ProjectionExpression that limits the results to the people section of the document.
var params = { TableName: 'factbook', ExpressionAttributeNames: { '#name': 'name', '#source': 'source' }, ExpressionAttributeValues: { ':name': countryName, ':source': 'Factbook' }, KeyConditionExpression: '#name = :name and #source = :source', ProjectionExpression: 'people' }; docClient.query(params, function(err, data) { if(err) { console.log('03 err:') console.log(err.toString()); callback(null, { statusCode: 400, body: 'Country not found: ' + countryName }); } else { if (!data || data.Items.length===0) { callback(err, { body: null }); } else { callback(null, { body: JSON.stringify(data.Items[0].people) }); } } });
Code in people that's different from country
Here's the result of running people in a browser. Now we're dealing with a much smaller section of the country JSON.
Invoking people function from a browser
population
Let's consider one other example. What if we only need to retrieve a single field from the JSON document, such as population? population lives under people.population.total in the country JSON. Here we can again modify the ProjectExpression, but this time we'll use dotted notation to indicate a path through the document. Once again though we have to deal with the fact that total is a DynamoDB reserved word. We can resolve that with another #attributename shortcut. Here''s what our parameter code ends up looking like:
var params = { TableName: 'factbook', ExpressionAttributeNames: { '#name': 'name', '#source': 'source', '#tot': 'total' }, ExpressionAttributeValues: { ':name': countryName, ':source': 'Factbook' }, KeyConditionExpression: '#name = :name and #source = :source', ProjectionExpression: 'people.population.#tot' };
people function parameter code
{
"body": "{\"people\":{\"population\":{\"total\":329256465}}}"
}
To shorten the result to just be the value, we can change our callback as follows to bypass the containing people and population objects.
callback(null, { body: JSON.stringify(data.Items[0].people.population) });
Now the result is:
{
"total": 329256465
}
Any time we want to return just a scalar value, we can use this technique of a dotted document path in a ProjectionExpression.
In Conclusion
Today in Part 1 we brought public-domain CIA World Factbook data into AWS, storing country records in DynamoDB and image/JSON files in S3 storage. We used a Lambda function to read JSON files from S3 and inject them as documents into our DynamoDB table. Working with DynamoDB from JavaScript was fast and easy. We did have to learn how to work around a few caveats, including empty strings not permitted in the document data and how to deal with reserved words in queries.We then created Lambda functions to get at the data. We saw that we could return an entire large country JSON, or a subsection of it, or just a discrete individual property. Once we had functions at each of these levels of data, creating derivates for other sections or properties was trivial. Developing Lambda functions, editing and testing right in the AWS console, was also a quick and painless experience. We did have to be careful to adhere to proper JavaScript coding patterns for asynchronous methods such as the use of promises.
We now have our data in place and a means to access it. Now that we've laid this groundword, we'll go on in Parts 2 and 3 to create web and voice interfaces so users can work with the data. Stay Tuned!
Next: Part 2: Front-end API & Web Site using Lambda Functions and DynamoDB
No comments:
Post a Comment