A large chunk of the Fediverse was scraped; your posts are being “released” 


@amphetamine@social.wxcafe.net @tastytea okay i'm pretty sure that this is assuming "implicit consent" which is a No No with IRBs. however!!! this may not be the kind of study that "needs" irb review.

it probably violates the GDPR though, and definitely some kind of institutional data ethics policy. i do work collecting behavioral data (with consent!) and we have to be really really careful with that even though it's not even under IRB review so? they could _definitely_ get in trouble if even one person whose stuff got in the scrape were to state that they didn't consent

@er1n @amphetamine @tastytea given i just sent them a strongly worded email stating the fact that no one on efdn consented, and im in the middle of writing a similar one to their university...

@amphetamine@social.wxcafe.net @tastytea useful stuff:

art 16.1, Feasibility, and social and environmental impact: "Should the project likely produce a significant impact on the objects of the research or, in general, on society, the environment or the biosphere, researchers shall responsibly examine the potential impact, providing details of these assessments in the appropriate documentation". which they sort of did, but not really

25.1, Protection of persons involved in research: "Researchers shall pay due respect to all persons involved in their research, without compromising their health, the wellbeing of the community, and the safety and healthiness of the environment in which they work."


@amphetamine@social.wxcafe.net @tastytea

and here's the fucking kicker:
26.1, Informed consent: "Without prejudice to the principle of due respect for human dignity and autonomy, should the research entail the involvement of recruited participants, the research leader ensures that applicable norms on informed consent are respected, with special regard to incompetent subjects or, in any event, to individuals unable to give consent."

which they did not do! so they grossly violated UMilan's ethics code and they _absolutely_ can get in trouble

@bstacey @er1n @amphetamine @tastytea In case anyone has failed to point this out: if they forgot to remove identifying data from the set, then they've put UMilan on the hook for a GDPR violation, which can mean fines of up to 4% of yearly turnover.

okay, i condensed everything the scrapers at UMilan violated and some grievance policy documentation. have fun and don't trash my doc!


@bstacey @amphetamine@social.wxcafe.net @tastytea

@bgcarlisle i feel like you might be interested in this, particularly the fact that 10k posts from scholar.social made it in (as noted in the doc one post upthread) in contravention of your use policy

@er1n @bgcarlisle It looks like the letter has stabilized (the only changes today were 3 new signatures and my moving the Codice Etico excerpts to the end as an appendix of sorts). Maybe it's time to freeze the text and move on to the next stage?

@er1n Hmm, maybe you should switch the doc to read-only (I think the creator is the one who has that option?) to prevent accidental slip-ups, and we can prepare some stern emails to send?


@er1n @bgcarlisle OK. I will try to make a nicely formatted version. Fair warning, though --- I have garbage brain this week, so I might be a bit slow with it, just from general disorganization.

I was wondering: should I sign the letter too?

I only see administrators in the list so I'm a bit hesitant to do it ^^'

@er1n @bgcarlisle


If you'd like to sign, you're welcome to. I'm not an administrator myself! :-) The document is read-only now, but I can add you when I make a nicely-formatted version. How would you like to be credited?

@er1n @bgcarlisle

Being credited as a user of miaou.drycat.fr with my handle would be nice.

Thanks for making the letter; it will certainly make them react :blobcat:

Is there a way to get a follow-up on the authors' and university's answers?

@er1n @bgcarlisle

@l4p1n Before we send the letter (a job which I guess is currently falling to me by default?), I'd like to get a list of people who want to be CC'ed on the email. Any response we get should be received by all those who take an active interest and shared with the community, I think.

@er1n @bgcarlisle

Me; my mail address is in my profile. Or I can share my personal email with you by DM.
@er1n @bgcarlisle

@er1n @bstacey @amphetamine @tastytea Thanks for putting this together...I am going to download the dataset for myself to understand what they got.

@er1n @amphetamine @tastytea Thanks. It feels good to unleash my Angry Academic in the cause of justice.

@er1n @amphetamine @tastytea A note that is probably too tangential to go into the letter: They couldn't even get the definition of "federated timeline" right.

@er1n thank you for writing this, it's really helpful!

@melodicake this is a bunch of people at this point! but thank you <3

@er1n @bstacey @amphetamine @tastytea We checked the instance list and so you can add discrimination and reckless endangerment to the list.

We counted around a half dozen "free speech" instances and none of the major ones are there, so they're basically targeting left-wingers here, which means mostly members of marginalized groups and therefore people who are more likely to be victims of state-sponsored violence.

@KitsuneAlicia @er1n @amphetamine @tastytea It appears that they didn't actually say where they got their list of instances from, which is just one more example of their general level of harmful incompetence.

@tastytea @er1n @amphetamine @KitsuneAlicia Yeah, I managed to miss that on a first read (the sentence is split across pages and my eyes were getting blurry looking for it, which was my clue to step away and rest a bit).

@er1n @bstacey @amphetamine @tastytea Is anyone going to be able to sign this document? Because I would like them to scrub any data they got.

@NeoAJ @er1n @bstacey @amphetamine @tastytea

I would sign the document as a user of miaou.drycat.fr who would love to get their data removed from their systems.

I wonder if users count towards the signature count 🤔

@er1n @amphetamine @tastytea

1. I wrote an introduction to summarize the incident.
2. I added a description of what local and federated timelines are to the "Mistaken classification of post privacy" section, in order to help distinguish what the authors conflate.
3. In "Failure to de-identify data", I shortened the big paragraph, since some of what it talked about didn't seem applicable to scraping by way of public TLs. Also, I added a quick paragraph about the risk of de-anonymization.
4. I snarked about their not disclosing their funding sources.
5. I added the sentence, "We express our gratitude to the administrators of Harvard Dataverse for acting promptly to deaccession the dataset."

re: Responding to Fediverse scraping 

@er1n @amphetamine @tastytea Hi, Queer.party is being scraped so I can state that!

@er1n @amphetamine @tastytea probably violates dmca assuming users keep copyright to their content
