Showing posts with label PostGIS. Show all posts

Monday, June 30, 2025

Take the 40 Bus Route Out Route 40

After getting feedback, advice, and counsel about my recent public transport mapping updates in OpenStreetMap, I started adding the missing "40" bus route to the Baltimore Metro Area. Part of the planning was whether to include bus stops and stopping locations, part was identifying errors and omissions in the OSM base, and part was being able to visualize current routes with OSM data.


Then a fragment:



I can see the 40 on existing bus stop signs:




This stop includes the OR (Orange), the 40, 59, 62, and 160 routes. When I looked, both the OSM data for stop 7597 and the MTA data showed '59;62;160;OR', with no 40, as if the bus went to the Essex Park + Ride on the other side of the street instead of this stop.


So I am unsure what the right data should be until I ride the bus or observe the stop in person.

The guidance for adding a new route ("Step 1 - Make sure that each bus stop in the route has been added to OpenStreetMap") said to check all the nodes. Given the above discrepancy, I hope to add at least one direction next; small fixes can happen after, like when the routes/schedules change.

The last overview map update page from MTA [ mta.maryland.gov/transit-maps ] is from 2022, and the 40 route is not on the "BaltimoreLink Interactive System Map". That map is hosted on Google rather than by the government. However, extrapolating the URL pattern shows this page is available in 2025:


Methodology

For myself, and others who may wish to get some GIS mojo working, here are the general steps I've taken to gather data, analyze it, and visualize it in various ways to find improvements to OSM content. The earlier post [ https://jspath55.blogspot.com/2025/06/which-bus-such-bus.html ] covers some of this; apologies for repeating myself.

MTA Data

The Maryland Department of Transportation includes the Maryland Transit Administration (MTA). In addition to the schedule site listed above, there is a GIS portal with bus stops and routes, though I've not gotten much out of the latter.


Home > services > Transportation > MD_Transit (FeatureServer)


The way to pull the data is via:

https://geodata.md.gov/imap/rest/services/Transportation/MD_Transit/FeatureServer/9/query

https://geodata.md.gov/imap/rest/services/Transportation/MD_Transit/FeatureServer/9/query?where=stop_id+is+NOT+NULL ...


"All" query: stop_id is NOT NULL

Output fields: stop_id, stop_name, Routes_Served, shelter
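That query can also be scripted rather than run from the portal page. A sketch that builds the request URL with the standard ArcGIS REST query parameters (the function name is mine; resultOffset is there because these servers cap the number of records per response, so large extracts need paging):

```python
from urllib.parse import urlencode

BASE = ("https://geodata.md.gov/imap/rest/services/Transportation/"
        "MD_Transit/FeatureServer/9/query")

def stops_query_url(offset=0):
    """Build the FeatureServer query URL for one page of bus stop records."""
    params = {
        "where": "stop_id IS NOT NULL",
        "outFields": "stop_id,stop_name,Routes_Served,shelter",
        "returnGeometry": "true",
        "f": "json",               # machine-readable output instead of HTML
        "resultOffset": offset,    # paging past the server's record cap
    }
    return BASE + "?" + urlencode(params)
```

Fetching successive pages (bumping resultOffset) until the server returns no features would pull all 4,549 stops.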

Results in a linear format:

# records: 4549

stop_name: Cylburn Ave & Greenspring Ave
Routes_Served: 31, 91, 94
Shelter: Yes
stop_id: 1
Point:
X: -8533795.9128
Y: 4772066.8330999985

stop_name: Lanier Ave & Sinai Hospital
Routes_Served: 31, 91, 94
Shelter: No
stop_id: 3
Point:
X: -8534126.0864
Y: 4772153.208300002

[...]

I parsed this into a CSV format to load into QGIS. The wrinkle was figuring out how to deal with the X/Y coordinates returned rather than the expected latitude/longitude pairs. QGIS was happy when I set the coordinate reference system (CRS) to "ESRI:102100 - WGS_1984_Web_Mercator_Auxiliary_Sphere."
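Setting the CRS lets QGIS handle the conversion, but the math behind those X/Y values is simple enough to sketch. A minimal version, assuming the spherical Web Mercator formulas (function name is mine):

```python
import math

R = 6378137.0  # Web Mercator sphere radius in meters

def mercator_to_lonlat(x, y):
    """Convert Web Mercator (EPSG:3857 / ESRI:102100) meters to lon/lat degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(math.atan(math.sinh(y / R)))
    return lon, lat
```

Feeding it the first sample point above lands at roughly (-76.66, 39.34), which is indeed the Cylburn Ave & Greenspring Ave area of Baltimore.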

OSM data

I used overpass-turbo.eu to get extracts of bus stopping locations and bus stops, then separated them.

[out:csv("ref", "public_transport", "name", "route_ref", "highway", "network", "operator", ::lat, ::lon, ::id)];

{{geocodeArea:"Maryland"}}->.metroArea;

node(area.metroArea)[bus=yes];
out;

I used "Maryland" rather than "Baltimore" to pick up Anne Arundel County and other nearby locales, though this expanded the return to include eastern and western counties as well as DC Metro. Better too much data than not enough here.

The first tries did not include position or OSM node ID; eventually I found the special names needed to get these critical columns. After splitting into 2 files to allow QGIS layers for each, I found over 4,000 bus stops though under 1,000 stopping locations.

$ wc -l OSM_platforms_20250623.csv OSM_stops_20250623.csv
  4240 OSM_platforms_20250623.csv
   928 OSM_stops_20250623.csv
  5168 total

Sample export:



Add these 2 layers, plus the MTA extract, and start the reconciliation.

Things found:

  • MTA and OSM sequence the routes differently, and the MTA records appear in yet another order compared to bus stop signs. I wrote a Perl script to convert the MTA data into the order OSM expects to make it easier to update wrong data.
  • The bus stop names differ in upper case/lower case, and various abbreviations ("nb" versus "northbound", for example).
  • Typos in both OSM and MTA data conglomerate 2 or more routes into one number ("78150" instead of "78, 150" or "78;150").
  • Null data nodes: apparent OSM bus stops without details (e.g. openstreetmap.org/node/2743861086 ). These are presumably orphans, though they may be incomplete nodes that should remain. If no MTA stop is nearby, they are fixme candidates.
  • Typos of other sorts, missing stops, incomplete or duplicate descriptions.
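The reordering script mentioned in the first bullet was Perl; here is a Python sketch of the same idea, assuming the ordering seen in OSM route_ref values (numeric routes ascending, lettered CityLink routes like GR, OR, and SV at the end; function name is mine):

```python
def to_osm_route_ref(mta_routes):
    """Turn an MTA Routes_Served string like 'OR, 160, 59, 62' into the
    OSM route_ref style: semicolon-separated, numbers first, letters last."""
    parts = [p.strip() for p in mta_routes.replace(";", ",").split(",") if p.strip()]
    def sort_key(route):
        # numeric routes sort by value; lettered routes sort after, alphabetically
        return (0, int(route), "") if route.isdigit() else (1, 0, route)
    return ";".join(sorted(parts, key=sort_key))
```

With both sides in the same order, a plain string comparison is enough to flag mismatched stops.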

Things fixed:

I went back and forth on simplifying the updates by looking only at platforms, but decided that the version 2 scheme, with a stopping location paired with each platform, is the preferred way. I created a spreadsheet as a local index of the 40 bus route in either direction. The columns included nodes for each type, stop numbers and names, and an alternate name per the official signage ("Eastpoint" as the area for Eastern & 54th, east or west bound).

Only a handful of bus stop areas include a stopping location, so that is my current task. After that, on to steps 2 and 3:

Step 2 - Create the new bus relations
Step 3 - Create a route master relation


The QGIS map view, overlaying MTA, OSM stops/platforms:



By migrating these layers into a PostGIS/PostgreSQL database (using a CRS under 10,000), I can now write SQL queries with outer joins and other comparison techniques.

Stops in OSM but not MTA:

select
  ref,
  substr(trim(route_ref),1,8) as routes,
  trim(name) as name
from
  "OSM_platforms_20250623"
where
  cast(ref as text) NOT in (
    select
      field_4
    from
      "bus-stops-20250611"
  )
;
[...]
   14166 | 30;34    | Northern Parkway & Park Heights Avenue Westbound Far-side
   14215 |          | Tradepoint Atlantic Floor & Decor
   14216 |          | Tradepoint Atlantic Royal Farms
  190036 |          | Baltimore Greyhound Terminal
 3002008 |          | Good Luck Rd at Oakland Ave
 3003202 |          | Good Luck Rd at Parkdale High School
 3003205 |          | Good Luck Rd at Oakland Ave
(133 rows)


Stops in MTA but not OSM:

select
      upper(substr(mta.field_1,1,40)) as nm,
      substr(mta.field_2,1,32) as f2,
      mta.field_4,
      osm.ref
from
      "bus-stops-20250611" mta
left outer join
      "osm_platforms_20250623_md" osm on mta.field_4 = cast(osm.ref as text)
where
  osm.ref is NULL
or
  cast(osm.ref as float) <= 0
order by
  cast(mta.field_4 as float)
;

This one included DC Metro areas, so needs more tuning to be useful.

Show mismatched route lists per stop:

select
      upper(substr(mta.name,1,40)) as nm,
      substr(mta.route_osm_style,1,32) as mta_route_osm,
      substr(osm.route_ref,1,32) as osm_route,
      mta.stop,
      osm.ref
from
      "bus-stops-20250611+0626" mta
left outer join
      "osm_platforms_20250623_md" osm on cast(mta.stop as text)  = osm.ref
where
  mta.route_osm_style is distinct from osm.route_ref  -- unlike !=, this also keeps rows where osm.ref is NULL
order by
  mta.stop
;

The later version of the MTA list includes the reordered route list, since comparing the differently ordered strings in SQL was not an obvious approach. Example results:

[...]
Columns: nm | mta_route_osm | osm_route | stop
LOCH RAVEN BLVD & STATE HWY DR SB 53 53;103;104 422
LOCH RAVEN BLVD & SAYWARD AVE SB 53;GR 53;103;104;GR 435
SAINT PAUL ST & 31ST ST SB 51;95;SV 95;SV 482
SAINT PAUL ST & 29TH ST SB 51;95;SV 95;SV 483
CHARLES ST & REDWOOD ST NB 51;65;67;95;103;410;411;420;GR;S 51;65;67;95;103;164;410;411;420; 506
SAINT PAUL ST & FAYETTE ST FS SB 67;76;95;103;120;210;215;310;410 67;76;95;103;120;164;210;215;310 516
CHARLES ST & HYATT NB 51;67;94;95;103;SV 51;67;94;95;103;103;164;SV 518
CHARLES ST & FAYETTE ST NB 51;95;103;410;411;420;GR;SV 51;95;103;410;411;420;425;GR;SV 519
CHARLES ST & READ ST NB 51;95;103;GR;SV 51;95;103;410;GR;SV 525

I photographed the bus stop sign at Charles and Read Streets (northbound) only yesterday, confirming the list MTA had, and updated the node: openstreetmap.org/node/6254842377

Charles + Read Streets Baltimore

U.S. Route 40, as opposed to MTA Bus Route 40, is the "National Pike"; see also wikipedia.org/wiki/Old_National_Pike.



Wednesday, April 9, 2025

Food Service Mapping

Food banks run by charitable organizations are a public service where private entities fill in where government runs short. Picking up, storing, and sharing food throws lifelines to those in need. I volunteered to create updated maps for Scouting America and decided to use QGIS and related software tools.

(1) OSM

Going from the lowest level up, I added OpenStreetMap (OSM). This is an easy drag-and-drop from the base QGIS sources into the current project. Depending on the desired result, I change the transparency from none to around 50%.

The standard OSM layer sources from openstreetmap.org. I've found, using tools like Viking, that variations on the main source can be used, and I prefer the Humanitarian one. Eventually I found the wiki page that lets me set the feed, like this:

https://wiki.openstreetmap.org/wiki/Raster_tile_providers

https://a.tile.openstreetmap.fr/hot/{z}/{x}/{y}.png
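The {z}/{x}/{y} placeholders in that URL are the standard slippy-map tile scheme, and the tile covering any point can be computed directly. A sketch of the well-known formula (function name is mine):

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile indices covering a lon/lat point at a zoom level."""
    n = 2 ** zoom  # tiles per side at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y
```

Substituting the result into the URL template above fetches the tile for, say, central Baltimore at zoom 12.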

(2) Org

The second layer is the organization table. These have street addresses in the source, so I geo-located them with a Census Bureau tool that works via command line.

In a PostgreSQL database, I imported the organization details, which had only street addresses, not geo-data. To convert an address to a lat/lon point I found a Census page that I could script; once located, the data are converted to geometry for QGIS.


$ lynx -dump "https://geocoding.geo.census.gov/geocoder/locations/address?street=10001%20Bird%20River%20Rd&state=md&zip=21220&benchmark=2020"  | grep Interpolate
   Interpolated Longitude (X) Coordinates: -76.432431362395
   Interpolated Latitude (Y) Coordinates: 39.356117523642
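The same lookup can be assembled without screen-scraping. A sketch that just builds the geocoder URL (function name is mine; the format=json parameter asks the same endpoint for machine-readable output instead of the HTML page lynx dumps):

```python
from urllib.parse import urlencode

CENSUS_GEOCODER = "https://geocoding.geo.census.gov/geocoder/locations/address"

def census_geocode_url(street, state, zip_code, benchmark="2020"):
    """Build the Census Bureau geocoder URL used in the lynx command above."""
    params = {"street": street, "state": state, "zip": zip_code,
              "benchmark": benchmark, "format": "json"}
    return CENSUS_GEOCODER + "?" + urlencode(params)
```

A loop over the address table, fetching each URL and parsing the returned coordinates, replaces the per-address lynx/grep step.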

I ran the conversion steps and viewed points and polygons. Once the latitude and longitude are set, the PostGIS call I used to turn them into valid points is:

UPDATE org.org SET geom = ST_SetSRID(ST_MakePoint(lng, lat), 4326);

(3) Tract shapes

The next layer shows US Census tracts. I found a couple of sources for these shapefiles and used the Maryland state data.

https://www.census.gov/cgi-bin/geo/shapefiles/index.php?year=2020&layergroup=Census+Tracts


[omitting how to add this layer from ZIP or shape files]

To match the boundaries of the local district I wrote a filter, first including only Baltimore County tracts and next excluding specific tracts not in scope.

QGIS lets me save and load a filter definition, more or less a SQL code snippet.

<Query>"COUNTYFP"='005' and "tractce"  NOT IN (
        '400100',
        '400200',
        '400400',
[...]
        '494201',
        '494202',
        '980000',
        '980100',
        '980200',

        '9999999999'
)
</Query>

The blank line prior to the query end lets me run a unique sort to more quickly find gaps and add or subtract tracts.

(4) Unit Tracts Coverage

The coverage mapping uses a simple table with tract and unit/organizations. In order to connect these to the tracts I created a view.
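A sketch of that view, run here against SQLite for illustration (the table, column, and unit names are my invention; the real view joins the coverage table to the PostGIS tract layer with its geometries):

```python
import sqlite3

# Illustrative schema: tract records plus a unit-coverage table,
# joined into a view that maps each unit to its tracts.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tract (tractce TEXT PRIMARY KEY, name TEXT);
CREATE TABLE coverage (tractce TEXT, unit TEXT);
CREATE VIEW unit_tracts AS
  SELECT c.unit, t.tractce, t.name
  FROM coverage c JOIN tract t ON t.tractce = c.tractce;
INSERT INTO tract VALUES ('400100', 'Tract 4001'), ('400200', 'Tract 4002');
INSERT INTO coverage VALUES ('400100', 'Troop 101'), ('400200', 'Troop 101');
""")
rows = con.execute("SELECT unit, tractce FROM unit_tracts ORDER BY tractce").fetchall()
```

In PostGIS, loading the view as a QGIS layer then shades each unit's tracts on the map.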

Tuesday, February 20, 2024

Well Sample Data and GIS Development

 I looked at water quality tests for a few private wells in Maryland to get more practice with GIS tools. After reviewing public information I took note of key fields such as well depth. To show connections among the various aspects I started an entity relationship diagram, going from basic SQL to end up with 8 tables.


To keep the data consistent, a few constraints were included, such as:


ALTER TABLE
  well
ADD CONSTRAINT
  well_fkey_property
FOREIGN KEY
  (property_fk)
REFERENCES
  property(name)
;


  I decided to connect wells to a property, and buildings likewise. Normally one well is allowed per property, though while a replacement is being drilled one property might have 2 wells, and if a well runs dry you might say the property has no well.

 The old forms (pre-2000?) have Maryland grid coordinates, with accuracy to 1,000 feet, while the newer forms have a place for latitude/longitude. Water quality tests done by commercial lab contain values of constituents ranging from coliform bacteria to lead and arsenic.

 Sometimes the applications have typos.




Oops, mixed up east and north. I did the same when I first tried to load point data into GIS; once I noticed the drift and switched the values the wellhead came online where it should be.
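A cheap guard against that east/north mix-up is a bounding-box sanity check before loading. A sketch, assuming rough lat/lon extents for Maryland (the ranges and function name are mine):

```python
MD_LAT = (37.8, 39.8)    # approximate latitude range covering Maryland
MD_LON = (-79.6, -74.9)  # approximate longitude range covering Maryland

def fix_swapped(lat, lon):
    """Detect a transposed latitude/longitude pair (the east/north
    mix-up described above) and return the values in the right order."""
    def in_md(la, lo):
        return MD_LAT[0] <= la <= MD_LAT[1] and MD_LON[0] <= lo <= MD_LON[1]
    if not in_md(lat, lon) and in_md(lon, lat):
        return lon, lat  # swapped on the form; undo it
    return lat, lon
```

Running every hand-entered coordinate pair through a check like this catches the transposition before the wellhead lands in the wrong hemisphere.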

 After the table definitions and test data load, I used the QGIS example tutorials to include the bits required for my hand-curated data to be mapped correctly.

See: 
https://docs.qgis.org/3.28/en/docs/training_manual/spatial_databases/simple_feature_model.html
https://docs.qgis.org/3.28/en/docs/training_manual/spatial_databases/geometry.html

> insert into geometry_columns values ('','public','well','the_geometry',2,4326,'POINT');

> update well set the_geometry = 'SRID=4326;POINT(-75.882996 38.938629)' where name = 'CO-15-0020' ;

 






An 8-way join, or is it 7-way?

select
  r.value ,                             -- value
  r.uom,                                -- units
  p.name as name_p,                     -- param
  substr(r.name,1,12) as name_r,        -- result
  s.well_fk as well,                    -- well
  s.name as name_s,                     -- sample
  substr(s.sample_place,1,4) as place,  --
  substr(p.descr,1,10) as param_desc,   --
  substr(c.name,1,8) as name_c,         -- corp
  w.depth,                              -- well
  w.pump_installed as pump,             -- well
  substr(o.lot,1,10) as lot,            -- prop
  e.name as person,                     -- people
  s.sample_date
from
  people e,
  building b,
  lab_corp c,
  lab_sample s,
  lab_sample_results r,
  parameter p,
  property o,
  well w
where
  e.building_fk = b.name     -- people in building
and
  b.property_fk = o.name     -- building on property
and
  s.well_fk = w.name         -- sample from well
and
  w.property_fk = o.name     -- well on property
and
  r.parameter_fk = p.name
and
  r.lab_sample_fk = s.name
and
  s.lab_corp_fk = c.name
;


What have I accomplished? Now I have environmental data in a reachable database, which I can edit via the command line, or even using LibreOffice forms, and now view and update via QGIS.
Next I will figure out the step up from points data (the well heads) to polygons (property lines and building walls). Then, with sample data, set up views with contaminant levels on the map.



Looks a lot like dBASE IV, right?