Mario Zechner (@badlogic@mastodon.gamedev.place)
mastodon.gamedev.place
Today was ... interesting. If you followed me for the past months over on the shitbird site, you might have seen a bunch of angry German words, lots of graphs, and the occasional newspaper, radio, or TV snippet with yours truly. Let me explain. In Austria, inflation is way above the EU average. There's no end in sight. This is especially true for basic needs like energy and food. Our government stated in May that they'd build a food price database together with the big grocery chains. But..

It would be nice to be able to bring to light the price gouging that is taking place in Canada with regards to grocery stores.

The project is open source, so you might be able to leverage it for Canadian data. All you need is:

  • Understanding of the expected format for the project
  • Access to data from Canadian retailers. This can be acquired via APIs (these are usually free) or by scraping their sites.

At the bottom of the thread on Mastodon, the creator says they use the search APIs of the store websites. I wouldn’t have expected those to be easily accessible!

Daniel Quinn

Yeah a lot of chains even have a documented, developer-friendly API. If that’s not available though, you can usually figure out the API just by looking at the calls your browser makes when visiting a page. Most sites use a REST API for catalog pages that’s then rendered out with JavaScript.

If that doesn’t work, then you can usually scrape everything with Selenium. It’s a little harder to do, but still quite manageable, though that usually has to be a background job, as it’s slow.
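A rough sketch of the "watch what your browser calls" approach, assuming the store exposes a JSON search endpoint. The URL (`grocer.example`), the query parameters and the `products` / `name` / `price` fields are all made up for illustration; a real store's API will look different, so treat this as a starting shape rather than something that works as-is.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint and field names; a real store's API will differ.
SEARCH_URL = "https://grocer.example/api/search"


def fetch_search_page(query: str, page: int = 1) -> dict:
    """Call the (hypothetical) search endpoint and return the decoded JSON."""
    params = urllib.parse.urlencode({"q": query, "page": page})
    with urllib.request.urlopen(f"{SEARCH_URL}?{params}", timeout=10) as resp:
        return json.load(resp)


def extract_prices(payload: dict) -> list[dict]:
    """Flatten a search response into simple name/price records."""
    records = []
    for item in payload.get("products", []):
        # Skip entries without a usable price rather than guessing.
        if "name" not in item or "price" not in item:
            continue
        records.append({
            "name": item["name"],
            "price_cents": round(float(item["price"]) * 100),
        })
    return records


# Offline sample shaped like the assumed response, so no network is needed.
sample = json.loads(
    '{"products": [{"name": "Milk 2L", "price": "5.49"}, {"name": "Mystery item"}]}'
)
print(extract_prices(sample))
```

Keeping the fetch and the parsing separate means that when the site changes, you usually only have to touch `extract_prices`.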

Can’t, this would be illegal within a year. Scraping data is already taboo. How fucking dumb is that.

It’s why I hate the “starving artist worried about AI scraping” stories. They will be used to usher in stronger laws to prevent us from scraping this data. It’s a double-edged sword.

Otter

If you do get started with this, I’d love to follow along and find a place where I can help. If you guys make a community or Mastodon account for example, please link it :)

Would a system identifying products from a receipt work for this? Combined with other data sources (like web scraping), it would make it a lot easier to crowdsource the data, even if only sorta technically inclined people do it.

@nyan@lemmy.cafe

It would help to some extent, but to really get people to buy in you’d need an app to do the heavy lifting (that is, it’s easier to get people to snap a photo of their receipt than to type the info in one character at a time). Some people might still be willing to do it without, but how many?

You’d also have to relate the abbreviations that often appear on grocery receipts back to the items they represent, which is more data entry.

Yeah, I was thinking something along the lines of lots n lots of easy shitty data (ex. anyone who can take a picture), some pretty good data (ex. hand labeled receipts), some 100% reliable data (scraped/api) then some sort of system to correlate the 3, especially when prices match identically between receipt and api a fair sized database could create itself.

Also would need some sort of processing center to handle the many image processing requests, but maybe that could be done client side
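To make the "prices match identically between receipt and API" idea a bit more concrete, here is a minimal sketch. The product names, abbreviations and prices are invented, and it assumes the receipt lines have already been OCR'd into (abbreviation, price) pairs. An abbreviation is only linked automatically when exactly one scraped product has that price; anything else is set aside for hand labeling.

```python
from collections import defaultdict

# Illustrative data only: trusted prices from scraping/API, in cents.
scraped = {"Whole Milk 2L": 549, "Butter 454g": 699, "Large Eggs 12ct": 549}

# Receipt lines as (abbreviation, price in cents), e.g. from OCR of a photo.
receipts = [
    ("BTTR 454", 699),
    ("MLK 2L", 549),
    ("BTTR 454", 699),
]


def match_by_price(receipt_lines, scraped_prices):
    """Link receipt abbreviations to products whose price matches exactly.

    Returns (confident, ambiguous): an abbreviation is confident when exactly
    one scraped product has that price, otherwise it needs a human label.
    """
    by_price = defaultdict(list)
    for name, cents in scraped_prices.items():
        by_price[cents].append(name)

    confident, ambiguous = {}, {}
    for abbrev, cents in receipt_lines:
        candidates = by_price.get(cents, [])
        if len(candidates) == 1:
            confident[abbrev] = candidates[0]
        elif candidates:
            ambiguous[abbrev] = sorted(candidates)
    return confident, ambiguous


confident, ambiguous = match_by_price(receipts, scraped)
print(confident)   # {'BTTR 454': 'Butter 454g'}
print(ambiguous)   # {'MLK 2L': ['Large Eggs 12ct', 'Whole Milk 2L']}
```

In practice you would probably also want to match on store and date, since a price-only match gets ambiguous quickly once the catalog is large.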

Would stores attempt to hire teams to put in junk data?

Not exactly what you’re looking for, but check out this Marketplace piece:

https://www.cbc.ca/news/business/marketplace-shrinkflation-1.6654780

Rentlar

With the advent of e-ink price tags, it wouldn’t be surprising to me if something similar goes on with the prices of generic and medium tier items.

We’d have to see if an API is available somewhere first.

Huh. Now you’ve got me wondering… Could you leave a device hidden in the store that receives the IR signals that program the tags, capture that information, then parse it out later? You could literally log price changes at the shelves, in real time.

I don’t think they’re IR. Something similar to wifi. Surely encrypted.

Looks like there are two varieties - IR or Bluetooth Low-Energy (BLE).

Even without an API it should be possible, in theory, to just parse the data directly from their websites.

This also gives the grocery stores less of a leg to stand on in terms of legal or practical recourse. They chose to create a publicly browsable database of their prices; all you’re doing is browsing it.
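For a site that serves prices in plain HTML, parsing them out can be done with nothing but the standard library. This is only a sketch: the `product-name` and `product-price` class names and the sample markup are assumptions, and every real store page will need its own selectors.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collect (name, price in cents) pairs from spans with assumed class names."""

    def __init__(self):
        super().__init__()
        self._field = None   # which field the next text chunk belongs to
        self._name = None    # last product name seen, waiting for its price
        self.items = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "product-name" in classes:
            self._field = "name"
        elif "product-price" in classes:
            self._field = "price"

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "name":
            self._name = text
        elif self._field == "price" and self._name is not None:
            cents = round(float(text.lstrip("$")) * 100)
            self.items.append((self._name, cents))
            self._name = None
        self._field = None


# Made-up markup standing in for a store's catalog page.
page = """
<ul>
  <li><span class="product-name">Flour 2.5kg</span> <span class="product-price">$4.99</span></li>
  <li><span class="product-name">Butter 454g</span> <span class="product-price">$6.99</span></li>
</ul>
"""

parser = PriceParser()
parser.feed(page)
print(parser.items)   # [('Flour 2.5kg', 499), ('Butter 454g', 699)]
```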

I know of one in Alberta.

Grocery Tracker

Nice find! This looks like exactly it, but Canadian.

Are we looking to expand what’s here on the Grocery Tracker to incorporate what they are doing with the Austrian site?

I’d also like to look at other pinch points of government heel dragging. Housing, energy, medical, transportation, telecom, news etc. We all see these government contracts go out for seven figures and it’s always shown to be blown out of proportion.

A nice added bonus to the project in Austria was someone giving historical data. It would be great to have a similar leg up for Canada.

@nyan@lemmy.cafe

The issue with this sort of thing is primarily one of data entry, rather than “tech savvy” as such. Defining the database is easy compared to getting the data in there.

Quick options would include parsing the information out of the stores’ websites, or hacking or snooping on the stores’ own mobile apps (if they have them) to get price information in a usable format. Parsing the websites is possible, but if JavaScript is involved you may be looking at puppeting a browser with Selenium, which isn’t fast and can get tedious, and the approach depends on the websites being complete, accurate, and up-to-date. Approaches like this are inherently brittle, as even trivial changes made from the grocery chains’ end can cause them to break. Scraping information without a defined API or the cooperation of the owner of the data is a moving target. From experience, I can tell you that it gets annoying fast.

In the case of the Austrian government, they probably wanted that cooperation and a defined API, which would have required careful negotiations with each company and paid programmers looking at the corporate databases. That would have increased their cost and lengthened their projected timeframe. Corruption and corporate greed did the rest.
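One way to live with the brittleness described above is to sanity-check every scraped batch before it goes into the database, so a silent site change shows up as a loud failure instead of bad data. A minimal sketch, assuming records are dicts with `name` and `price_cents` keys; the 50% drop threshold is an arbitrary illustrative value, not a recommendation.

```python
def check_batch(records, previous_count, min_ratio=0.5):
    """Return a list of problems found in a scraped batch (empty means OK).

    records: list of dicts expected to carry "name" and "price_cents".
    previous_count: how many records the last successful run produced.
    min_ratio: hypothetical threshold; a sharp drop suggests the site changed.
    """
    problems = []
    for i, rec in enumerate(records):
        name = rec.get("name")
        cents = rec.get("price_cents")
        if not isinstance(name, str) or not name.strip():
            problems.append(f"record {i}: missing name")
        if not isinstance(cents, int) or cents <= 0:
            problems.append(f"record {i}: bad price {cents!r}")
    if previous_count and len(records) < previous_count * min_ratio:
        problems.append(
            f"only {len(records)} records, previous run had {previous_count}"
        )
    return problems


good = [{"name": "Milk 2L", "price_cents": 549}]
broken = [{"name": "Milk 2L", "price_cents": None}]

print(check_batch(good, previous_count=1))     # []
print(check_batch(broken, previous_count=10))
```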

They use the search APIs of the grocery store websites.

So essentially it was possible over there due to proper/favourable conditions, whereas here it would be much more difficult?

elmicha

The responsible minister claimed it’s an immense task and will take until autumn. It will only include 16 product categories (think flour, milk, etc.). And it will only be updated once a week.

I mean that’s pretty pathetic. Better than nothing, but “only updated once a week” sounds like “the intern who has to enter the prices works only for 20 hours”, not like they created an API and told the grocery chains to upload their prices.

@nyan@lemmy.cafe

Unknown. I don’t use the grocery chains’ websites (I’m of the “go to the nearest physical store and figure it out once there” persuasion), so I don’t know what the complexity level would be. It’s possible that they’re all older-school sites where you can lift the data straight from the HTML, which is relatively fast.

Pretty easy to get something basic set up if you get enough people to crowd-source data with photos of stuff in grocery stores and their receipts, along with some scraping to get data that’s available online. It’s a project that’s been on my backlog for a while, but I can bump it up if others want to join me in making this.

