Foursquare's 104M Points of Interest
72 points by marklit 2 days ago | 27 comments
  • tech234a 16 hours ago |
    Article has a minor typo: it reads "The US has ~23.5M records followed by Indonesia and Turkey with over 80M each" but the second figure should be 8M.
    • marklit 15 hours ago |
      Fixed. Thanks for spotting that.
  • tra3 16 hours ago |
    Why would Foursquare release this dataset? I can't help but try to think of an angle...
    • martinkallstrom 15 hours ago |
      The angle I can think of is to preserve the legacy of a dwindling operation and share the value that was created at the peak.
    • jsemrau 15 hours ago |
      90% of Foursquare's revenue comes from enterprise clients. This dataset would not cannibalize that revenue, but it would let the general public contribute by fixing bugs and finding new use cases, which might put them in a better spot when competing with Google Maps, Yelp, and Facebook Places.
    • billfor 14 hours ago |
      Most of the stuff in that dataset has APIs you can use live. They sent a notification that they were turning off citysearch towards the end of this year and beginning of next year. The API behind citysearch was the only way I know of that an individual could keep a categorized list of places like bars and restaurants under control. I would take their API output and convert it to KML to build my own Google map of places without all the Google ad crap. As well as being full-featured, the API would also mark places as closed, so your lists could auto-remove the old places and you could sometimes find what took over the location. I would also subcategorize places into favorites, new and notable, hidden places, types of bars, etc.

      I will miss Foursquare citysearch and its predecessor, a little Palm OS app known as Vindigo. Google and Yelp let you tag places in their apps but don't have as good an API, so going forward it will be hard to maintain a private list of places that can be categorized, rendered, filtered, maintained, and exported. Google and Yelp largely keep your POI info captive.
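
For anyone wanting to reproduce the export step described above, here is a minimal sketch of turning a categorized place list into KML, with closed places dropped. The dict keys (`name`, `lat`, `lon`, `category`, `closed`) are hypothetical stand-ins and not the actual response fields of the citysearch API.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def places_to_kml(places, skip_closed=True):
    """Render place dicts as a KML string, one Folder per category."""
    # Serialize with KML as the default namespace (no prefix on tags).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    folders = {}
    for place in places:
        # Hypothetical "closed" flag: drop places marked as closed.
        if skip_closed and place.get("closed"):
            continue
        category = place.get("category", "uncategorized")
        if category not in folders:
            folder = ET.SubElement(doc, f"{{{KML_NS}}}Folder")
            ET.SubElement(folder, f"{{{KML_NS}}}name").text = category
            folders[category] = folder
        placemark = ET.SubElement(folders[category], f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = place["name"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are written longitude first, then latitude.
        coords = f"{place['lon']},{place['lat']}"
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coords
    return ET.tostring(kml, encoding="unicode")
```

Writing the returned string to a `.kml` file should give something map tools can import, with each category showing up as its own folder.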

      • marklit 14 hours ago |
        To anyone considering going to all this effort, consider doing this work on OpenStreetMap. 50K contributors make OSM a bit better every month but a good map is never finished. https://rapideditor.org/edit
        • timeon 14 hours ago |
          And for accessing the API there is also a web interface: https://overpass-turbo.eu/
        • RicoElectrico 6 hours ago |
          Um, RapID is big tech's spin on OSM editing. Clicking stuff to import from AI or government data is hardly the essence of OSM, a complement at best.

          Use iD or JOSM on desktop, StreetComplete/Every Door/Go Map!!/Vespucci on the phone. Survey POIs in your local area, with your own feet. Big tech can't do that ;)

  • eichin 14 hours ago |
    Ooh, I have an "all ice cream shops in Massachusetts" project for which this would be at least an interesting cross-reference (the "places humans show up at" bias in Foursquare's business works in my favor here).

    Or rather, their former business? https://techcrunch.com/2024/10/22/farewell-to-foursquares-ap... says the user apps go away in less than a month...

    • snthd 10 hours ago |
      Here's an overpass turbo query against OpenStreetMap, if anyone was curious:

      https://overpass-turbo.eu/s/1UCX

          [out:json][timeout:25];
          {{geocodeArea:Massachusetts}}->.searchArea;
          (
          nwr["amenity"="ice_cream"](area.searchArea);
          nwr["shop"="ice_cream"](area.searchArea);
          );
          out geom;
      • eichin 6 hours ago |
        Neat! I'll have to poke at it and see if I can come up with a usefully broader-but-not-too-noisy search - my personal "obsessively search 'ice cream in $town' for each town" collection has about twice that many individually reviewed locations and I'm nowhere near done collecting. (My "bracketing shot" for this is that mailing list vendors claim they can sell me a list of 700ish ice cream related business addresses - no idea how precise they are! but it suggests my current 370-item list is "getting there".)
  • qwertox 14 hours ago |
    What's the licensing of this? I wonder if it could be used to improve OSM.

    --

    To answer my own question

    > This base layer of 100mm+ global places of interest ("POI") includes 22 core attributes (see schema here) that will be updated monthly and available for commercial use under the Apache 2.0 license framework.

    Found on Simon Willison’s Weblog [0], quoting the official announcement [1]. His page also shows how to use it with Datasette.

    [0] https://simonwillison.net/2024/Nov/20/foursquare-open-source...

    [1] https://location.foursquare.com/resources/blog/products/four...

    • sp8962 13 hours ago |
      • qwertox 13 hours ago |
        Thank you.

        There's an interesting link in that thread to a PMTiles viewer with the data in it:

        https://wipfli.github.io/foursquare-os-places-pmtiles/#map=1...

        • eichin 6 hours ago |
          That's really convenient - I zoomed the map to an area in my town, clicked on a place, and even though the popup is just raw data, it let me see which fields hold which values (which I could then feed back into pandas search expressions on the parquet files). Just little things like "locality" is the city/town in the US, and "fsq_category_labels" is where I'll find "Ice Cream Parlor".
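
A rough sketch of the kind of pandas search expression this enables, using the `locality` and `fsq_category_labels` fields named above. The in-memory rows are made-up sample data, the `name` column is an assumption, and treating `fsq_category_labels` as a list of label strings per place is also an assumption about the parquet layout.

```python
import pandas as pd

# Made-up stand-in rows; with the real data you would load one of the
# parquet files via pd.read_parquet(...) instead.
df = pd.DataFrame(
    {
        "name": ["Scoops", "Town Hardware", "Cone Zone"],
        "locality": ["Somerville", "Somerville", "Boston"],
        "fsq_category_labels": [
            ["Dining and Drinking > Dessert Shop > Ice Cream Parlor"],
            ["Retail > Hardware Store"],
            ["Dining and Drinking > Dessert Shop > Ice Cream Parlor"],
        ],
    }
)


def has_label(labels, needle):
    """True if any category label contains `needle`; missing labels count as no match."""
    if labels is None:
        return False
    return any(needle in label for label in labels)


def find_places(frame, locality, needle):
    """Rows in the given locality whose category labels mention `needle`."""
    in_town = frame["locality"] == locality
    matches = frame["fsq_category_labels"].apply(lambda labels: has_label(labels, needle))
    return frame[in_town & matches]


hits = find_places(df, "Somerville", "Ice Cream Parlor")
print(hits[["name", "locality"]])
```

Substring matching on the label is deliberately loose, so that a search for "Ice Cream" would also catch related categories if any exist.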
      • walterbell 11 hours ago |
        Thanks.

        > Foursquare and Overture places are like many geolocation-centric datasets: users aren’t supposed to ever see the raw data, either in a list or on the map. You have to filter by a confidence score. Otherwise, you’ll get tons of user-generated junk – pranks, mistakes, etc.. In the past, Foursquare would charge big bucks for the confidence scores as an upsell. If these scores aren’t part of the dataset, then no wonder the company feels comfortable releasing the data.

  • tipiirai 14 hours ago |
    I thought Foursquare no longer exists
    • xeromal 11 hours ago |
      They pivoted.
  • junto 14 hours ago |
    These are mostly way out of date now, right?
    • dzogchen 12 hours ago |
      No. Each POI has date properties indicating date added, date closed and last check date. Most POIs are surprisingly up to date.

      See my other comment to explore the dataset.

      • jorams 10 hours ago |
        A comment on the OSM community thread notes, and I can confirm based on the map you linked, that it contains many POIs that used to exist but haven't for a while, which nevertheless have a date_refreshed from this year.
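
To make the disagreement above concrete, here is a minimal sketch of a date-based freshness check. `date_refreshed` is the field named in this subthread; the `date_closed` field name and the ISO `YYYY-MM-DD` string format are assumptions. As noted above, a recent `date_refreshed` does not guarantee the place still exists, so treat this as a heuristic only.

```python
from datetime import date


def looks_current(poi, today, max_age_days=365):
    """Heuristic: not marked closed, and refreshed within `max_age_days`."""
    # Assumed field name: any non-empty closed date means the POI is gone.
    if poi.get("date_closed"):
        return False
    refreshed = poi.get("date_refreshed")
    if not refreshed:
        return False
    # Assumed format: ISO date strings such as "2024-06-01".
    age = today - date.fromisoformat(refreshed)
    return age.days <= max_age_days
```

Passing `today` explicitly keeps the check reproducible when comparing against a particular monthly release.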
  • dzogchen 12 hours ago |
    Someone already packaged them up in PMTiles format, if you want to explore the dataset (uses MapLibre): https://wipfli.github.io/foursquare-os-places-pmtiles/

    https://github.com/wipfli/foursquare-os-places-pmtiles

  • wslh 11 hours ago |
    Am I mistaken, or are we now at a data inflection point? As a frustrated consumer of POIs (e.g. Google Maps), I suspect that Foursquare understands their real position of power is no longer in the data itself (since many businesses are now doing the same) but in owning the last mile of the user experience. From a business perspective, we can create countless sites using this data, but that alone won’t significantly move the needle.
  • BrandiATMuhkuh 10 hours ago |
    About 10 years ago there was a project called "Sightsmap". It was a heatmap of the most photographed sights in the world.

    I really loved the map for planning road trips and city trips.

    I would love such a service again. I think OP's data/maps represent basically the same information.

  • FollowingTheDao 10 hours ago |
    I started reading the article but stopped out of sheer jealousy of that guy's PC rig:

    "I'm using a 6 GHz Intel Core i9-14900K CPU. It has 8 performance cores and 16 efficiency cores with a total of 32 threads and 32 MB of L2 cache. It has a liquid cooler attached and is housed in a spacious, full-sized, Cooler Master HAF 700 computer case. I've come across videos on YouTube where people have managed to overclock the i9-14900KF to 9.1 GHz.

    The system has 96 GB of DDR5 RAM clocked at 6,000 MT/s and a 5th-generation, Crucial T700 4 TB NVMe M.2 SSD which can read at speeds up to 12,400 MB/s. There is a heatsink on the SSD to help keep its temperature down. This is my system's C drive."

    • eichin 5 hours ago |
      That does sound glorious, but I didn't have any trouble loading individual parquet files into pandas on a 3-year-old Thinkpad X1...