Trying a switch to tal@lemmy.today, at least for a while, due to recent kbin.social stability problems and to help spread load.

  • 1 Post
  • 40 Comments
Joined 1Y ago
Cake day: Jun 13, 2023


AGI is not a new term. It’s been in use since the 90s and the concept has been around for much longer.

It’s not new today, but it post-dates “AI” and hit the same problem then.


Reddit had the ability to have a per-subreddit wiki. I never dug into it on the moderator side, but it was useful for some things, like setting up pages with subreddit rules and the like. I think that moderators had some level of control over it, at least over whether to allow non-moderator edits, maybe on a per-page basis.

That could be a useful option for communities; I think that in general, there is more utility for per-community than per-instance wiki spaces, though I know that you admin a server with one major community which you also moderate, so in your case, there may not be much difference.

I don’t know how amenable django-wiki is to partitioning things up like that, though.

EDIT: https://www.reddit.com/wiki/wiki/ has a brief summary.


I broadly agree that “cloud” has an awful lot of marketing fluff to it, as with many previous buzzwords in information technology.

However, I also think that there was legitimately a shift, from the point in time where one got a physical box assigned to them, to VPSes becoming a thing, to something like AWS. A user really did become increasingly-decoupled from the actual physical hardware.

With a physical server, I care about the actual physical aspects of the machine.

With a VPS, I still have “a VPS”. It’s virtualized, yeah, but I don’t normally deal with it dynamically.

With something like AWS, I’m thinking more in terms of spinning up and spinning down instances when needed.

I think that it’s reasonable to want to describe that increasing abstraction in some way.

Is it a fundamental game-changer? In general, I don’t think so. But was there a shift? Yeah, I think so.

And there might legitimately be some companies for which that is a game-changer, where the cost-efficiencies of being able to scale up dynamically to handle peak load on a service are so important that it permits their service to be viable at all.


I mean, scrolling down that list, those all make sense.

I’m not arguing that Google should have kept them going.

But I think that it might be fair to say that Google did start a number of projects and then cancel them – even if sensibly – and that for people who start to rely on them, that’s frustrating.

In some cases, like with Google Labs stuff, it was very explicit that anything there was experimental and not something that Google was committing to. If one relied on it, well, that’s kind of their fault.


Nah, because there are definitely projects Google started on there. The one OP mentioned is on there, and I remember Google Zeitgeist from back when.

EDIT: Not saying that this is comprehensive, but only five entries reference being acquired from elsewhere in their description.


I bet someone has made a list.

googles

Yup.

The first item on the page is a search field. The “all” category has 288 entries.

https://killedbygoogle.com/


https://www.cato.org/policy-report/january/february-2017/megaprojects-over-budget-over-time-over-over#

THE IRON LAW OF MEGAPROJECTS

Performance data for megaprojects speak their own language. Nine out of ten such projects have cost overruns. Overruns of up to 50 percent in real terms are common, over 50 percent not uncommon. Cost overrun for the Channel Tunnel, the longest underwater rail tunnel in Europe, connecting the UK and France, was 80 percent in real terms. For Boston’s Big Dig, 220 percent. The Sydney Opera House, 1,400 percent. Similarly, benefit shortfalls of up to 50 percent are also common, and above 50 percent not uncommon.

One may argue, of course, as was famously done by Albert Hirschman, that if people knew in advance the real costs and challenges involved in delivering a large project, nothing would ever get built — so it is better not to know, because ignorance helps get projects started. A particularly candid articulation of the nothing‐​would‐​ever‐​get‐​built argument came from former California State Assembly speaker and mayor of San Francisco Willie Brown, discussing a large cost overrun on the San Francisco Transbay Terminal megaproject in his San Francisco Chronicle column:

News that the Transbay Terminal is something like $300 million over budget should not come as a shock to anyone. We always knew the initial estimate was way under the real cost. Just like we never had a real cost for the [San Francisco] Central Subway or the [San Francisco‐​Oakland] Bay Bridge or any other massive construction project. So get off it. In the world of civic projects, the first budget is really just a down payment. If people knew the real cost from the start, nothing would ever be approved. The idea is to get going. Start digging a hole and make it so big, there’s no alternative to coming up with the money to fill it in [emphasis added].

Rarely has the tactical use by project advocates of cost underestimation, sunk costs, and lock‐​in to get projects started been expressed by an insider more plainly, if somewhat cynically.

Maybe there needs to be the introduction of new mechanisms to deal with assessing the cost of very large projects.


Yeah, I ran into this on /r/europe when there were some EU legislation issues. The EFF does have some activity in the EU, but it does have a mostly-US focus, and there isn’t really a direct analog.

It depends on what your interest is.

EDRi (European Digital Rights) in Europe has come up on a couple of advocacy issues I’ve followed. If you’re in Europe, they might be worth a look. They don’t feel quite the same to me, but maybe that’s what you’re looking for.


My understanding is that there had long been an ongoing concern on /r/piracy that the community would get shut down at some point, so the other stuff – the API restrictions and the rest of the spez drama – was kind of just adding to the big factor already pushing people away: that the community could vanish at any time.

The lead mod on /r/piracy also set up a dedicated instance – there was definite commitment – made it clear that he was making the move, and was demodded on /r/piracy, so there were several factors creating momentum behind the move.

Those are all factors that did not generally exist for other communities.


What’s been your experience with youtube recommendations?

I’ve never had a YouTube account, so YouTube doesn’t have any persistent data on me as an individual to do recommendations unless it can infer who I am from other data.

They seem to do a decent job of recommending the next video in a series done in a playlist by an author, which is really the only utility I get out of suggestions that YouTube gives me (outside of search results, which I suppose are themselves a form of recommendation). I’d think that YouTube could do better by just providing an easy way to get from the video to such a list, but…


I don’t like the idea of link taxes myself.

But even setting aside the question of whether link taxes are a good idea, I don’t understand why they’re making what sounds to me like a dubious antitrust argument. It seems like a simply bizarre angle.

If the Canadian government wants news aggregators to pay a percentage of income to news companies, I would assume that it can just tax news aggregators – not per link to a Canadian news source, but for operating in the market at all – take the money, and then subsidize Canadian news sources. It may or may not be a good idea economically, but it seems like it’d be on considerably firmer footing than trying to use antitrust law to bludgeon news aggregators into taking actions that would trigger a link tax by aggregating Canadian news sources.


and I wondered if it wouldn’t be a good idea to somehow gather a list of small instances

https://lemmy.fediverse.observer/list

https://kbin.fediverse.observer/list

If you require open signups and then sort by number of users, ascending, that’s an auto-generated list of small instances that (presumably) are looking for users.


Yeah, I don’t think I really agree with the author as to the difficulty with dig. Maybe it could be better, but as protocols and tools go, I’d say that dig and DNS are an example where a tool does a pretty good job of coverage. Maybe not DNSSEC, dunno how dig does there, and knowing to use +norecurse is maybe not immediately obvious, but I can list a lot of network protocols for which I wish there were an equivalent to dig.

However, a lot of what the author seems to be complaining about is not really stuff at the network level, but stuff happening at the host level. And it is true that there are a lot of parts in there if one considers name resolution as a whole, not just DNS, and no single tool that can look at the whole process.

If I’m doing a resolution with Firefox, I’ve got a browser cache for name resolutions independent of the OS. I may be doing DNS over HTTPS, and that may always happen or only be a fallback. I may have a caching nameserver at the OS level. There’s the /etc/hosts file. There’s configuration in /etc/resolv.conf. There’s NIS/yp. Windows has its own name resolution stuff hooked into the Windows domains machinery, with several mechanisms to do name resolution, whether via broadcasts when no domain controller is present or via a DC when one is; Apple has Bonjour, and more generally there’s zeroconf. The order in which all of this happens isn’t immediately clear to someone, and there’s no tool that can monitor the whole process end to end – these are indeed independent systems that kind of grew organically.

Maybe it’d be nice to have an API to let external software initiate name resolutions via the browser and get information about what’s going on, and then have a single “name resolution diagnostic” tool that could span several of these name resolution systems, describe what’s happening, and help highlight problems. I can say that gethostbyname() could also use a diagnostic call to extract more information about what a resolution attempt did and why it failed; libc doesn’t expose a lot of useful diagnostic information to the application, even though libc does know what it is doing during a resolution attempt.
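Not a real diagnostic tool, but as a minimal sketch of the kind of cross-layer check I mean – assuming Python with the third-party dnspython package, and using example.com as a stand-in – one could compare what the full OS resolution stack returns against a plain DNS query and flag disagreements:

```python
# Rough sketch, not a full diagnostic: compare what the OS resolver stack
# returns (getaddrinfo, which consults /etc/hosts, nsswitch, caches, etc.)
# with a direct DNS query, to spot cases where the layers disagree.
# Assumes the third-party dnspython package is installed.
import socket

import dns.resolver  # pip install dnspython

def os_level_addresses(name: str) -> set[str]:
    """IPv4 addresses as seen through the full OS resolution stack."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    return {info[4][0] for info in infos}

def dns_only_addresses(name: str) -> set[str]:
    """IPv4 addresses from a plain DNS A query, bypassing hosts files etc."""
    answer = dns.resolver.resolve(name, "A")
    return {rdata.address for rdata in answer}

if __name__ == "__main__":
    name = "example.com"
    os_addrs = os_level_addresses(name)
    dns_addrs = dns_only_addresses(name)
    print("OS stack:", sorted(os_addrs))
    print("DNS only:", sorted(dns_addrs))
    if os_addrs != dns_addrs:
        print("Mismatch: something other than plain DNS (hosts file, cache, "
              "DoH, mDNS, ...) is influencing resolution.")
```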


make dig’s output a little more friendly. If I were better at C programming, I might try to write a dig pull request that adds a +human flag to dig that formats the long form output in a more structured and readable way, maybe something like this:

Okay, fair enough.

One quick note on dig: newer versions of dig do have a +yaml output format which feels a little clearer to me, though it’s too verbose for my taste (a pretty simple DNS response doesn’t fit on my screen)

Man, that is like the opposite approach to what you want. If YAML output is easier to read, that’s incidental; that’s intended to be machine-readable, a stable output format.
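To be fair, machine-readable output does make it easy to build your own friendlier presentation on top of it. A rough sketch, assuming a dig new enough to support +yaml and the PyYAML package, and making no assumptions about the schema beyond it being valid YAML:

```python
# Rough sketch: run dig with its machine-readable YAML output and re-present
# it however you like. Assumes dig supports +yaml and that PyYAML is
# installed; the schema itself is just re-dumped, not interpreted.
import subprocess

import yaml  # pip install pyyaml

def dig_yaml(name: str, rrtype: str = "A"):
    proc = subprocess.run(
        ["dig", "+yaml", name, rrtype],
        capture_output=True, text=True, check=True,
    )
    return yaml.safe_load(proc.stdout)

if __name__ == "__main__":
    data = dig_yaml("example.com")
    # From here you can pull out and format just the parts you care about.
    print(yaml.safe_dump(data, sort_keys=False))
```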


Duplicity uses the rsync algorithm internally for efficient transport. I have used it. I’m presently using rdiff-backup, driven by backupninja out of a cron job, to back up to a local hard drive; it does incremental backups (which would address @Nr97JcmjjiXZud’s concern). That also uses the rsync algorithm. There’s also rsbackup, which likewise uses rsync but which I have not used.

Two caveats I’d note that may or may not be a concern for one’s specific use case (which apply to rdiff-backup, and I believe both also apply to the other two rsync-based solutions above, though it’s been a while since I’ve looked at them, so don’t quote me on that):

  • One property that a backup system can have is to make backups immutable – so that only the backup system has the ability to purge old backups. That could be useful if, for example, the system with the data one is preserving is broken into – you may not want someone compromising the backed up system to be able to wipe the old backups. Rdiff-backup expects to be able to connect to the backup system and write to it. Unless there’s some additional layer of backups that the backup server is doing, that may be a concern for you.

  • Rdiff-backup doesn’t do dedup of data. That is, if you have a 1GB file named “A” and one byte in that file changes, it will only send over a small delta and will efficiently store that delta. But if you have another 1GB file named “B” that is identical to “A” in content, rdiff-backup won’t detect that and only use 1GB of storage – it will require 2GB and store the identical files separately. That’s not a huge concern for me, since I’m backing up a one-user system and I don’t have a lot of duplicate data stored, but for someone else’s use case, that may be important. Possibly more-importantly to OP, since this is offsite and bandwidth may be a constraining factor, the 1GB file will be retransferred. I think that this also applies to renames, though I could be wrong there (i.e. you’d get that for free with dedup; I don’t think that it looks at inode numbers or something to specially try to detect renames).
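For a sense of what content-level dedup means in practice, here’s a minimal sketch – plain Python, nothing rdiff-backup-specific – that finds byte-identical files by hashing their contents, which is the kind of check a dedup-aware tool would do so that the “A” and “B” above get stored and transferred only once:

```python
# Minimal illustration of content-based dedup: group files by a hash of their
# contents, so byte-identical files (like "A" and "B" above) are detected and
# could be stored/transferred only once. Not how rdiff-backup works internally.
import hashlib
import os
from collections import defaultdict

def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root: str) -> dict[str, list[str]]:
    by_hash: dict[str, list[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_hash[file_digest(path)].append(path)
            except OSError:
                pass  # unreadable file; skip it
    return {digest: paths for digest, paths in by_hash.items() if len(paths) > 1}

if __name__ == "__main__":
    for digest, paths in find_duplicates(".").items():
        print(digest[:12], paths)
```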


I mean, that visualizes the changes, but what I’m saying is that I think that it’d be possible to go further with collaborative art than having a one-pixel-per-person cooldown.


For example, I might self host a server just for my account but I read all my content from lemmy.world. Am I not using their bandwidth and their resources anyway?

Well, it’d use your CPU to generate the webpages that you view. But, yeah, it’d need to transfer anything that you subscribe to to your system via federation (though the federation stuff may be “lower priority” – I don’t know how lemmy and kbin deal with transferring data to federated servers rather than requests from users directly browsing them at the moment, but at least in theory, serving the user browsing directly has to have a higher priority to be usable).

But what would be more ideal – and people are going to have to find out what the scaling issues are with hard measurements, but this is probably a pretty reasonable guess – is to have a number of instances, with multiple users on each. Then, once lemmy.world transfers a given post or comment once via federation, that other instance stores it and can serve up the webpages and content to all of the users registered on that other instance.

If you spread out the communities, too, then it also spreads out the bandwidth required to propagate each post.

As it stands, at least on kbin (and I assume lemmy), images don’t transfer via federation, though, so they’re an exception – if you’re attaching a bunch of images to your comments, only one instance is serving them. My guess is that that may wind up producing scaling problems too, and I am not at all sure that all lemmy or kbin servers are going to be able to do image-hosting, at least in this fashion.


I remember, as a kid, once going to a Buddhist sand-painting exhibition at an art museum. They made these huge, beautiful mandalas by carefully shaking colored sand into designs. When they were done, they dumped it out into the ocean. I remember – being pretty impressed with it – asking something like “but why would you destroy it”, and the Buddhist monk guy said something like “it reminds us not to be too attached to material things”.

Don’t know if I agreed with the guy, but I think that there is probably a very real perspective out there that ephemerality has intrinsic value.


I don’t know what his argument is, but Stross’s account seems to be @cstross.


It looks like people have created /r/place alternatives. If you like /r/place, you could just use one of those, go draw something neat, and popularize that. I don’t really see what drawing a bunch of images complaining about spez using the service that spez runs is going to accomplish. On the other hand, if people are doing neat things elsewhere, then other people might want to participate.

https://old.reddit.com/r/place/comments/64zlnw/an_easy_guide_for_r_place_alternatives/

That was six years ago, so there could be newer stuff out.


I can’t speak as to why other people use their alternatives, but if you use mpv with yt-dlp like the guy above – which I do, and which isn’t really a full replacement for YouTube, just for part of it – then you can use stuff like deblocking, interpolating, and deinterlacing filters, hardware decoding, etc. It lets me use my own keybindings to move around and such. Seeking happens instantly, without rebuffering time.

It also means that your bandwidth isn’t a constraint on the resolution you use, since you aren’t streaming the content as you watch, though it does mean that you need to wait for the thing to download before you watch it.

There, one is really talking about the difference between streaming and watching a local video, and about mpv being a considerably more-powerful and better-performing video player than YouTube’s client.

I generally do it when I run into a long video or a series of videos that I know I’m going to want to probably watch.
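If it helps, the download-then-watch workflow is easy to script. A rough sketch, assuming yt-dlp’s Python module and mpv are installed; the URL is just a placeholder:

```python
# Rough sketch of the download-then-watch workflow: grab the video with
# yt-dlp's Python API, then hand the local file to mpv. Assumes yt-dlp and
# mpv are installed; the URL below is just a placeholder.
import subprocess

from yt_dlp import YoutubeDL  # pip install yt-dlp

def download_and_play(url: str) -> None:
    opts = {"outtmpl": "%(title)s.%(ext)s"}  # name files after their titles
    with YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=True)
        filename = ydl.prepare_filename(info)
    # Play locally; mpv brings its own keybindings, filters, hw decoding, etc.
    subprocess.run(["mpv", filename], check=True)

if __name__ == "__main__":
    download_and_play("https://www.youtube.com/watch?v=EXAMPLE")
```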

EDIT: It also looks, from this test video, like YouTube’s web client doesn’t have functioning vsync on my system, so I get tearing, whereas mpv does not have that issue. That being said, I’m using a new video card, and it’s possible that there’s a way to eliminate that in-browser, and it’s possible that someone else’s system may not run into that – I’m not using a compositor, which is somewhat unusual these days.


No. Why, are they doing something like that?


Not really the focus of the article, but I think that /r/place was a neat idea, but hard to produce much with.

I feel like maybe there are forms of collaborative art that might go further, like letting people propose various changes to a chunk of pixels on an artwork and letting people vote on the changes.


I’m not familiar with whatever this SanDisk portable thing is, but SanDisk is a drive manufacturer, so the locking may be implemented at the drive level.

googles

Sounds like when it’s locked, the drive presents itself as a CD drive containing a Windows executable that unlocks it.

https://www.techrepublic.com/forums/discussions/how-do-i-fully-remove-sandisk-unlocker/

I wouldn’t expect Linux to be able to write to it if that’s it. It won’t even see the actual drive, just the non-writeable CD drive.

Honestly, I’d probably just write off the drive if the data isn’t important. The amount of time required, just to wind up with what is basically a used hard drive, is probably not going to be worth it.


I remember this story from about twenty years back hitting the news:

https://www.theregister.com/2001/04/12/missing_novell_server_discovered_after/

Missing Novell server discovered after four years

In the kind of tale any aspiring BOFH would be able to dine out on for months, the University of North Carolina has finally located one of its most reliable servers - which nobody had seen for FOUR years.

One of the university’s Novell servers had been doing the business for years and nobody stopped to wonder where it was - until some bright spark realised an audit of the campus network was well overdue.

According to a report by Techweb it was only then that those campus techies realised they couldn’t find the server. Attempts to follow network cabling to find the missing box led to the discovery that maintenance workers had sealed the server behind a wall.


The problem is that I’m pretty sure that spammers are specifically targeting Google with a lot of their effort because of the size of its userbase.

So DDG or whoever else can be a solution for some, but if they get a big enough userbase, the SEO dollars are going to go towards hitting them too. Leveraging smaller size can’t be a fix for everyone.

Kinda like Reddit and the Fediverse. Right now – and in the past – there’s a limited amount of money in trying to jam spam in front of the userbase’s eyeballs on lemmy and kbin. But whenever the userbase grows by a factor of ten, so does the return-on-investment to a spammer in gaming their system. If the entire Reddit userbase collectively moved here tomorrow, the spammers would very quickly follow.


I’ve seen admins advising others to block EU in their firewall because they are aware of this liability and the lack of a privacy policy.

At least in the US, courts will not recognize EU jurisdiction over you and will not enforce EU policies against you unless you are actively doing business in the EU. Note that “doing business” may be a lower bar than you think – if you specifically advertise targeting people in the EU, that may qualify, say – but it is a higher bar than merely not being firewalled.

Now, you may still want to just block the EU or God knows what jurisdiction if you’re worried about being hassled, but you shouldn’t normally need to conform to a country’s laws just because people in that country can reach your computer on the Internet.

IANAL.


This is what happens when you don’t push back against corporations.

I don’t feel like what Netflix did is unreasonable, as it’s described in the article. It says that anyone who already has the plan can continue with it. They just aren’t offering it to new users.

If someone made a decision to go with Netflix based on what was on offer, it sounds like that’s not being taken away or anything.

They didn’t lure in users with one service and the expectation that the service would continue to be available and then try to pressure them to get something else instead.


No, but you could start a list of magazines/sublemmies that are devoted to producing original content.


Do you feel that Nintendo should be obliged to divest itself of the Mario, Zelda, and Metroid franchises?


I liked Fallout 4. There were things that I didn’t like as much as New Vegas, sure, but I don’t see alternatives that are doing a better job, and that’s really the bar that counts.


Strictly-speaking, Usenet doesn’t have to be commercial, but while it was once typical for ISPs, universities, and other institutions to bundle Usenet service, that’s now rare, so for most people, commercial Usenet access is really the only realistic option.


Cursive writing to be reintroduced in Ontario schools this fall
Relegated in 2006 to an optional piece of learning in Ontario elementary schools, cursive writing is set to return as a mandatory part of the curriculum starting in September.

Or someone just makes a library that supports both.


For the sheer “WTF” factor, that probably wins, but as a practical gaming method, it might be doable to play DRL rather than Doom itself.


Looking at this:

https://en.wikipedia.org/wiki/Jolly_Roger

It looks like the main historical elements used in pirate vexillology were:

  • a skull

  • a skeleton

  • crossbones

  • a sword, spear, or dagger

  • a heart, often bleeding

  • an hourglass

  • elements of period seagoing clothing



Well, assuming that this is even directly related to the forum, as opposed to, say, email logs from the Reddit internal email server or something, things that might not be public:

  • Private messages between users.

  • Browsing data. I mean, maybe a user only posts on /r/politics, and that’s public, but spends a lot of time browsing /r/femdom or whatever.

  • IP addresses of users. Might be able to associate multiple accounts held by a user.

  • Passwords. While hopefully stored in a salted and hashed format, so they can’t be trivially obtained, they can still be attacked via dictionary attacks, which is why people are told not to use short, predictable passwords (see the sketch after this list).

  • Email addresses (if a user registered one)

  • Reddit has some private chat feature that I’ve never used, which I imagine is logged.
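On the password point above: a minimal sketch of what “salted and hashed” storage looks like, and of why a leak still allows offline guessing of weak passwords. The parameters and the example passwords are for illustration only:

```python
# Illustration of salted password hashing and why a leaked (salt, digest)
# pair still allows offline dictionary attacks against weak passwords.
# Parameters are illustrative; real systems use vetted libraries and tuned costs.
import hashlib
import hmac
import os

ITERATIONS = 600_000  # illustrative work factor

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) as a server would store them."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)

if __name__ == "__main__":
    # The salt defeats precomputed tables, but an attacker with the leaked
    # salt+digest can still hash candidate passwords one by one:
    salt, digest = hash_password("hunter2")  # a short, predictable password
    for guess in ["password", "letmein", "hunter2"]:
        if verify(guess, salt, digest):
            print("Cracked:", guess)
```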


Doesn’t actually matter. You can have HTTP URLs that bounce to content on the lemmy.dbzer0.com instance through other Fediverse instances, including both kbin and lemmy instances. For kbin, it’s an “/m/” prefix for “magazine”, and for lemmy a “/c/” prefix for “community”.

https://kbin.social/m/piracy@lemmy.dbzer0.com

https://lemmy.ml/c/piracy@lemmy.dbzer0.com

Every time someone stands up a new Fediverse instance, that’s another instance that can be bounced through.

Here’s a list of lemmy instances:

https://lemmy.fediverse.observer/list

And kbin instances:

https://kbin.fediverse.observer/list

People have been adding them quite quickly over the past few days.
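Mechanically, generating those bounce URLs for any set of instances is trivial. A quick sketch using only the two example instances from above – the lists above are where you’d pull more hostnames from:

```python
# Print "bounce" URLs that reach the same community through other federated
# instances. The two hostnames here are just the examples from this thread;
# the instance lists above are where you'd pull more from.
TARGET = "piracy@lemmy.dbzer0.com"

# hostname -> software; kbin uses an /m/ prefix, lemmy uses /c/
INSTANCES = {
    "kbin.social": "kbin",
    "lemmy.ml": "lemmy",
}
PREFIX = {"kbin": "m", "lemmy": "c"}

for host, software in INSTANCES.items():
    print(f"https://{host}/{PREFIX[software]}/{TARGET}")
```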

Also, there are URL shorteners and all that sort of thing that can make it harder to block content via doing redirects in case they start doing something like doing a substring search on URLs for anything containing lemmy.dbzer0.com. I imagine that you scurvy scalawags of /r/piracy are probably more-familiar with what Reddit blocks in that area than I am. Might be a good idea to mix that in, just to discourage them trying to block large chunks of all of kbin/lemmy with the rationale that they’re trying to block you pirate folks from linking others to your pirate fortress.

EDIT: A bit more experimentation shows that whatever you guys have set up and however lemmy normally works, it apparently can be reached, albeit with a warning thrown up about a bad cert, via the IP address:

http://167.86.124.45/

This opens a number of interesting doors for linking to your outlaw port.

Unless the Reddit blocking code has fully-conformant-to-protocol parsing of IP addresses – and very, very few software packages do – it probably isn’t capable of reducing IP addresses in URLs to a unified format. However, the web browsers that users use normally have a fairly-robust implementation and probably can understand such interesting formats. And IPv4 addresses in URLs can be written in other numeric bases. Here’s a handy base calculator:

https://www.rapidtables.com/convert/number/base-converter.html

So, for example, we could throw a little hex (base 16) in there with a leading “0x”:

http://167.0x56.124.45/

Maybe make it interesting by doing a little octal (base 8) on top of that, with a leading “0”:

http://0247.0x56.124.45/

Maybe merge the last two octets there…124*(2^8)+45=31789, so:

http://0247.0x56.31789/
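If you don’t want to do the arithmetic by hand, the conversions are simple enough to script. A quick sketch that reproduces the variants above from the dotted-quad form; check any generated URL in a browser before relying on it:

```python
# Generate equivalent-but-unusual spellings of a dotted-quad IPv4 address for
# use in URLs: per-octet hex/octal, trailing octets merged into one number,
# and the whole address as a single integer. Browsers' URL parsers commonly
# accept these; naive substring filters generally don't normalize them.
def variants(dotted: str) -> list[str]:
    a, b, c, d = (int(part) for part in dotted.split("."))

    def octal(n: int) -> str:
        return f"{n:#o}".replace("0o", "0")  # C-style leading-zero octal

    return [
        dotted,                              # plain dotted quad
        f"{a}.{b:#x}.{c}.{d}",               # hex in the second octet
        f"{octal(a)}.{b:#x}.{c}.{d}",        # octal first octet as well
        f"{octal(a)}.{b:#x}.{c * 256 + d}",  # last two octets merged
        str((a << 24) | (b << 16) | (c << 8) | d),  # whole address as one integer
    ]

if __name__ == "__main__":
    for form in variants("167.86.124.45"):
        print(f"http://{form}/")
```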

A quick test also shows that, aside from the cert warning in Firefox, your lemmy instance looks fine with serving up your instance content to any browser that reaches you via any hostname that maps to your IP. If any of you guys have a domain anywhere and can add an A record to it that points at 167.86.124.45, you can link to your lemmy instance via that hostname. If you get an actual cert for that hostname and throw it up on the server, and if whatever the server infrastructure for your lemmy instance is supports multiple certs – I have no idea, haven’t looked at it – then you can probably even get rid of the cert warning.

I can probably throw you some other ideas if they start cracking down on those. Feel free to ping me.


don’t know where they will live however, in the streets?

Canada does have the second-largest land area of any country on Earth, behind only Russia. I wouldn’t think that space would be a terrible constraint for Canada.


https://en.wikiquote.org/wiki/Donald_Knuth

Beware of bugs in the above code; I have only proved it correct, not tried it.

Donald Knuth’s webpage states the line was used to end a memo entitled “Notes on the van Emde Boas construction of priority deques: An instructive use of recursion” (1977).