Description of your request or bug report:
Allow CSV data dump
Trello link:
Per a private DM, I'm considering doing this this week. It really should only take a day and I think you all could come up with some cool visualizations!
(It also makes onboarding easier... people feel better if they can back up their stats)
OMG. This would be amazing. I could finally stop tracking my Japanese stuff on GR.
We could probably even do a fancy community-edited xls that reads from the CSV and does fancy graphics and stats.
And then Brandon might like them and copy them into the site
Don't spoil his master plan.
Ok, I got the first data download available! It's just your user books data, not any reading sessions. But let me know what you think of the process, UI & the fields I surface.
You can find it here: https://learnnatively.com/account-settings/?form=data_download.
Not working for me. At first it didn't do anything and now it's telling me "issues with our servers".
Nice, thanks!
Would it be too hard to include the reading sessions in it?
Or even a separate CSV if it's too complicated to merge into the same table, then throw the documents in a .zip / .7z / .tar, whatever.
Maybe it broke after generating mine
Sorry, I should've said so more explicitly. Yes, there will be more data that you can download... they'll just be separate files. I'm imagining one for general data for each type and one for sessions for each of movies, TV shows & books. Ultimately 6 in total.
Oops! Well, first bug to squash...
Thanks! BTW, if you want to put the cherry on top: a small .txt document explaining each file would make it extra fancy.
Also, while I doubt it's too big of a deal, I noticed that you could potentially guess the URL of other users if you know when they generated the file.
Just a minor security thing I thought I'd mention.
And also, the files are deleted after a while, I'd guess?
So there is a "key file" that you can download... it's next to the "User Book" title. There is a slight bug rn where it disappears... but if you reload the page it'll be there.
Yeah, I thought about that... maybe I'll add a random string at the end. Originally I had a totally random file name, but I didn't want the downloaded filename to be random, as that'd be unfortunate. I tried using the HTML "download" attribute, but Chrome wasn't abiding by it and apparently it's very flaky... so I settled on this approach.
Not at the moment, but eventually I'd probably clean them up, yes.
I guess the string would be enough. I haven't worked with CloudFront, but if you don't want to pollute the filename, you could create a signed URL.
But it's probably too much of a headache, although maybe such a function would have a use somewhere else.
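(As a rough illustration, here's a minimal sketch of signing a URL with a CloudFront canned policy using plain openssl. The URL, key pair ID and key file below are placeholders, and this assumes the exports are served through CloudFront in the first place.)

#!/bin/sh
# Sketch: CloudFront signed URL with a canned policy, using openssl.
# URL, KEY_PAIR_ID and private_key.pem are placeholders, not real values.
URL="https://example.cloudfront.net/exports/user_books.csv"
KEY_PAIR_ID="APKAEXAMPLE"
EXPIRES=$(( $(date +%s) + 3600 ))   # link stops working after one hour
POLICY='{"Statement":[{"Resource":"'"$URL"'","Condition":{"DateLessThan":{"AWS:EpochTime":'"$EXPIRES"'}}}]}'
# RSA-SHA1 sign the policy, base64-encode it, then swap +=/ for -_~ as CloudFront expects.
SIGNATURE=$(printf '%s' "$POLICY" | openssl dgst -sha1 -sign private_key.pem | openssl base64 -A | tr '+=/' '-_~')
echo "${URL}?Expires=${EXPIRES}&Signature=${SIGNATURE}&Key-Pair-Id=${KEY_PAIR_ID}"

An expired link then just returns an access-denied error from CloudFront, so old export URLs stop being guessable or shareable.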
YESSSSSSSSSS will check this out when I'm on my not-work PC
Hah, yeah, that's for truly private stuff and I imagine it would be a real pain to figure out, as most Amazon things are for me
I think it'd be pretty much impossible to guess if I put 6 random characters on it...
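(For what it's worth, generating such a suffix is a one-liner; a sketch assuming a shell environment, with the filename pattern invented for illustration:)

# Sketch: tack a random 6-character suffix onto a friendly filename.
SUFFIX=$(openssl rand -hex 3)          # 3 random bytes -> 6 hex characters
FILENAME="user_books_${SUFFIX}.csv"    # e.g. user_books_a3f91c.csv

Six hex characters already gives about 16.7 million possibilities per file, which makes blind guessing impractical.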
Sorry, I had to do it
You should be good to go!
And I fixed the "key file" disappearing bug.
I will try to get reading session data up by tonight. Doing the TV data after that should be pretty straightforward, hopefully.
Edit: I also added a random string to the file name.
Beautiful! I was a bit worried because my OpenOffice whatever couldn't parse the Japanese properly, but I just stuffed it into Google Sheets and it works fine there. Time to start trying to play with the visualisers! Can't wait for the session data!
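(If the problem is that the CSV is UTF-8 without a byte-order mark, some spreadsheet importers guess the wrong encoding; prepending a BOM is a common workaround. A sketch, assuming that's what's going on here:)

# Sketch: prepend a UTF-8 byte-order mark so importers detect the encoding.
# \357\273\277 is EF BB BF in octal, which a portable printf understands.
printf '\357\273\277' | cat - user_books.csv > user_books_bom.csv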
Could we have an interface to this that is automatable, please? I would like to be able to have a cron job on my local machine that backs up the data once a week. I do this at the moment with booklog.jp, which works because they have a relatively easily scrapable interface that doesn't require pressing any JavaScript buttons, so I can grab the CSV with a couple of wget invocations.
#!/bin/sh
# PASSWORD and OUTFILE must be set (e.g. exported by the cron wrapper).
# as of some time in 2019 their server started insisting on a referer header
wget --save-cookies cookies.txt -O login.html --post-data='service=booklog&ref=&account=pm215&password='"$PASSWORD" --referer=https://booklog.jp/login https://booklog.jp/login
# now we need to load this web page, to fish out a specific link from it
wget --load-cookies cookies.txt -O export.html https://booklog.jp/export
DOWNLOADURL="$(sed -ne 's/.*\(https:\/\/download\.booklog\.jp[^"]*\).*/\1/p' export.html)"
if [ "$(echo "$DOWNLOADURL" | grep -c https)" -ne 1 ] ; then
    echo "Failed to find download URL in export.html!"
    exit 1
fi
echo "Loading csv from $DOWNLOADURL"
wget --load-cookies cookies.txt -O "$OUTFILE" "$DOWNLOADURL"
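(For comparison, the requested interface could be as simple as one authenticated GET. Everything below is hypothetical; Natively has no such endpoint or token auth today, so the URL, header and variable are invented purely for illustration.)

#!/bin/sh
# Hypothetical sketch: learnnatively.com does not currently expose this endpoint.
# NATIVELY_TOKEN would come from the environment; the URL path is invented.
wget --header="Authorization: Bearer $NATIVELY_TOKEN" -O user_books.csv "https://learnnatively.com/api/export/user_books.csv"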
Another item for the future API