I wonder if the forum URLs could be better while still avoiding percent-encoding. For example, by transliterating kanji (assuming Japanese), kana and Hangul into the Latin alphabet before stripping out non-ASCII characters. Transliteration may have errors, but that’s better than unspecific or misleading URLs. (I'm also wondering about brackets.)
You mean you don’t have an SRS deck for memorising unicode codepoint to kanji mappings?
And instead of the default Anki multiple choice, you have to input the codepoint
I knew about unidecode, but it defaults to Chinese mappings.
On searching further, I’ve found miurahr/unihandecode on Codeberg. It is a transliteration library that converts Unicode characters and words into the ASCII alphabet and is aware of language preference priorities.
Otherwise, Japanese segmentation libraries might help with transliteration.
In full U+1234 U+5678 format
I’m not quite following. Can you give me an example?
The URL for 赤ずきん、旅の途中で死体と出会う 🐺 Home Thread 🔪 is simply https://forums.learnnatively.com/t/home-thread/7751 which is very non-specific.
The URL for きらきらひかる (Profoundly Weird Book Club) is https://forums.learnnatively.com/t/profoundly-weird-book-club/5064 which gives the wrong meaning.
I think that’s more of a Discourse limitation than anything else.
WaniKani Community uses percent-encoding, which preserves the Unicode characters once the URL is decoded. However, the raw URL isn't readable by default.
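To make the trade-off concrete, here is a minimal sketch of what percent-encoding does to a short Japanese title, using only Python's standard library. The title is taken from the thread above; nothing here is specific to Discourse.

```python
from urllib.parse import quote, unquote

title = "赤ずきん"

# Each character is turned into its UTF-8 bytes, and each byte is written as %XX.
encoded = quote(title)
print(encoded)

# Decoding gives the original text back, so no information is lost.
print(unquote(encoded))

# The cost is length: 4 characters become 36.
print(len(title), len(encoded))
```

So the slug stays faithful to the title, but it grows roughly ninefold for kana and kanji and is unreadable wherever it isn't decoded.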
A possible solution I thought about is using Japanese segmentation.
$ sudachi -a
赤ずきん、旅の途中で死体と出会う wolf Home Thread hocho
赤 名詞,普通名詞,一般,*,*,* 赤 赤 アカ 0 [17550]
ずきん 副詞,*,*,*,*,* ずきん ずきん ズキン 0 []
、 補助記号,読点,*,*,*,* 、 、 、 0 []
旅の途中で 名詞,固有名詞,一般,*,*,* 旅の途中で 旅の途中で タビノトチュウデ 0 []
死体 名詞,普通名詞,一般,*,*,* 死体 死体 シタイ 0 [24863]
と 助詞,格助詞,*,*,*,* と と ト 0 []
出会う 動詞,一般,*,*,五段-ワア行,連体形-一般 出会う 出会う デアウ 0 []
空白,*,*,*,*,* キゴウ 0 []
wolf 名詞,固有名詞,人名,一般,*,* Wolf Wolf ウォルフ 0 [20699]
空白,*,*,*,*,* キゴウ 0 []
Home 名詞,普通名詞,一般,*,*,* ホーム home ホーム 0 [14167]
空白,*,*,*,*,* キゴウ 0 []
Thread 名詞,普通名詞,一般,*,*,* スレッド Thread スレッド 0 [412]
空白,*,*,*,*,* キゴウ 0 []
hocho 名詞,普通名詞,一般,*,*,* hocho hocho hocho -1 [] (OOV)
EOS
So the URL slug could become aka-zukin-tabinotochuude-shitai-to-deau-wolf-home-thread-hocho
I wonder if they changed the code, or if it's a non-default setting.
The fact is that you can put anything between the /t/ and the topic number, as it’s just an SEO thing. For example, this will still work and load the proper preview:
https://forums.learnnatively.com/t/megumin-is-best-girl/5064
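In other words, only the number identifies the topic. A small sketch of that idea, assuming topic URLs of the form /t/<slug>/<number> as seen in this thread; `topic_id` and `with_slug` are hypothetical helper names, not anything from Discourse itself.

```python
import re
from urllib.parse import urlparse, urlunparse

# /t/<slug>/<number>, optionally followed by more path (e.g. a post number).
_TOPIC = re.compile(r"^/t/(?P<slug>[^/]+)/(?P<id>\d+)(?P<rest>/.*)?$")


def topic_id(url: str) -> int:
    """Return the numeric topic id, ignoring whatever the slug says."""
    m = _TOPIC.match(urlparse(url).path)
    if m is None:
        raise ValueError(f"not a topic URL: {url}")
    return int(m.group("id"))


def with_slug(url: str, slug: str) -> str:
    """Return the same URL with the slug swapped for something else."""
    parts = urlparse(url)
    m = _TOPIC.match(parts.path)
    if m is None:
        raise ValueError(f"not a topic URL: {url}")
    path = f"/t/{slug}/{m.group('id')}{m.group('rest') or ''}"
    return urlunparse(parts._replace(path=path))


real = "https://forums.learnnatively.com/t/profoundly-weird-book-club/5064"
joke = "https://forums.learnnatively.com/t/megumin-is-best-girl/5064"
print(topic_id(real) == topic_id(joke))  # both point at the same topic number
print(with_slug(real, "x"))
```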
Edit:
Seems like @seanblue went through the Discourse URL shenanigans in the past.
So it could be a mix of a setting not enabled in Discourse and/or a web server configuration limitation.
You sure about that?
casts explosion
Shame @biblio had a terrible accident before completing the 2024 bingo.
@polv @Megumin So I’ve switched the URL generation to include Japanese characters.
See 赤ずきん、旅の途中で死体と出会う 🐺 Home Thread 🔪
However, the shared URL now gets very long… which honestly makes me a bit hesitant to do this, although it’s probably slightly better for SEO.
(example link: %E8%B5%A4%E3%81%9A%E3%81%8D%E3%82%93%E3%80%81%E6%97%85%E3%81%AE%E9%80%94%E4%B8%AD%E3%81%A7%E6%AD%BB%E4%BD%93%E3%81%A8%E5%87%BA%E4%BC%9A%E3%81%86-home-thread/7751?u=brandon).
What do we think?
- Yes. Even though it makes sharing the url slightly harder. It’s pretty.
- No. While it looks better, I don’t like the long copied url.
- I don’t care!
IMHO it's better without, at least until browsers handle these things more gracefully.
I would actually prefer all of these texts to just be replaced by “x”
That would save me from replacing them manually each time I want to put a link somewhere
Ok! I’ve decided to revert it to ASCII only.
@nikoru hmm, interesting. Unfortunately, since there are other languages, I think it’s slightly different for Japanese. I guess we could do another poll
I usually do that with Amazon links, but I even strip the ASCII and just leave the dp/ASIN lol.
I’m saving a lot of bytes of information!
Yeah, I always strip them to the shortest format possible. Good to know that the URL bit here is just for SEO purposes.
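For anyone who wants to automate the Amazon trimming, a minimal sketch: it assumes the product page path contains /dp/ followed by a 10-character ASIN, which is the common shape but not the only one. `short_amazon_url` is a hypothetical name, and the example URL is made up.

```python
import re
from urllib.parse import urlparse

# Assumption: the ASIN is 10 uppercase letters/digits right after /dp/.
_ASIN = re.compile(r"/dp/([A-Z0-9]{10})(?:/|$)")


def short_amazon_url(url: str) -> str:
    """Drop the title slug, ref and query string, keeping only /dp/<ASIN>."""
    parts = urlparse(url)
    m = _ASIN.search(parts.path)
    if m is None:
        raise ValueError(f"no /dp/<ASIN> segment in: {url}")
    return f"{parts.scheme}://{parts.netloc}/dp/{m.group(1)}"


# Made-up URL and ASIN, purely for illustration.
long_url = "https://www.amazon.co.jp/Some-Title/dp/B000000000/ref=sr_1_1?keywords=x"
print(short_amazon_url(long_url))
```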
Would it be impossible to implement @polv's suggestion? If I understand it right, it would change the URL so that something like ワンピース book club reads like wanpi-su-book-club
chart-y kind of question: is there a way to create a page tracking chart even when you don’t time your reads? i would like to track "I read X pages on Y day" over time until a book is completed (or, i guess more pedantically, "I logged X pages, and here’s how far that moved the progress bar as a percentage").
… reading that back I think I will probably need to scribble down a visualisation or something to make that ramble make any sense.
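One way to picture the request is as data rather than a drawing. This is only a sketch of the numbers such a chart would plot, with made-up log entries and a made-up page count; `progress` is a hypothetical name and has nothing to do with how Natively stores reading logs.

```python
from datetime import date
from itertools import accumulate


def progress(log, total_pages):
    """Turn (day, pages_logged) entries into (day, pages_so_far, percent) rows."""
    days = sorted(log)                                   # oldest entry first
    running = accumulate(pages for _, pages in days)     # cumulative page count
    return [(day, so_far, round(100 * so_far / total_pages, 1))
            for (day, _), so_far in zip(days, running)]


# Made-up example: three untimed log entries for a 200-page book.
log = [(date(2024, 1, 1), 20), (date(2024, 1, 3), 35), (date(2024, 1, 2), 10)]
for day, so_far, percent in progress(log, total_pages=200):
    print(day, so_far, percent)
```

Plotting the day against either of the last two columns would give the "pages over time until completion" line, with no timing information needed.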