Product Updates & Casual Natively Discussion

You mean you don’t have an SRS deck for memorising unicode codepoint to kanji mappings? :slight_smile:

3 Likes

And instead of the default Anki multiple choice, you have to input the codepoint :upside_down_face:

1 Like

I knew about unidecode, but it defaults to Chinese mappings.

On searching further, I’ve found miurahr/unihandecode on Codeberg.org: a transliteration library that converts Unicode characters and words into the ASCII alphabet and is aware of language preference priorities.

Otherwise, Japanese segmentation libraries might help with transliteration.

In full U+1234 U+5678 format

3 Likes

I’m not quite following. Can you give me an example?

The URL for 赤ずきん、旅の途中で死体と出会う 🐺 Home Thread 🔪 is simply https://forums.learnnatively.com/t/home-thread/7751. Very non-specific.

The URL for きらきらひかる (Profoundly Weird Book Club) is https://forums.learnnatively.com/t/profoundly-weird-book-club/5064. Wrong meaning.

I think that’s more of a Discourse limitation than anything else.

1 Like

WaniKani Community uses percent-encoding, so the Unicode characters are preserved once the URL is decoded. However, the encoded form isn’t readable by default.
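To make the percent-encoding point concrete, here is a minimal sketch using Python's standard `urllib.parse`. The sample title is just the first word of the thread title mentioned above; nothing here is specific to WaniKani or Discourse.

```python
from urllib.parse import quote, unquote

title = "赤ずきん"

# quote() encodes the text as UTF-8 and writes each byte as %XX.
encoded = quote(title)

# unquote() reverses that, so no information is lost in the URL.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

Each kana or kanji takes three UTF-8 bytes, so every character becomes nine characters in the encoded URL, which is why these links get long quickly.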

A possible solution I thought about is using Japanese segmentation.

$ sudachi -a
赤ずきん、旅の途中で死体と出会う wolf Home Thread hocho
赤      名詞,普通名詞,一般,*,*,*        赤      赤      アカ    0       [17550]
ずきん  副詞,*,*,*,*,*  ずきん  ずきん  ズキン  0       []
、      補助記号,読点,*,*,*,*   、      、      、      0       []
旅の途中で      名詞,固有名詞,一般,*,*,*        旅の途中で      旅の途中で      タビノトチュウデ        0       []
死体    名詞,普通名詞,一般,*,*,*        死体    死体    シタイ  0       [24863]
と      助詞,格助詞,*,*,*,*     と      と      ト      0       []
出会う  動詞,一般,*,*,五段-ワア行,連体形-一般   出会う  出会う  デアウ  0       []
        空白,*,*,*,*,*                  キゴウ  0       []
wolf    名詞,固有名詞,人名,一般,*,*     Wolf    Wolf    ウォルフ        0       [20699]
        空白,*,*,*,*,*                  キゴウ  0       []
Home    名詞,普通名詞,一般,*,*,*        ホーム  home    ホーム  0       [14167]
        空白,*,*,*,*,*                  キゴウ  0       []
Thread  名詞,普通名詞,一般,*,*,*        スレッド        Thread  スレッド        0       [412]
        空白,*,*,*,*,*                  キゴウ  0       []
hocho   名詞,普通名詞,一般,*,*,*        hocho   hocho   hocho   -1      []      (OOV)
EOS

So aka-zukin-tabinotochuude-shitai-to-deau-wolf-home-thread-hocho
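As a rough sketch of how that slug could be assembled from the tokenizer output above: the token list below is copied by hand from the Sudachi output (surface, top-level part of speech, reading), and the kana table is a tiny hypothetical one that only covers this example. A real implementation would need a full kana table (long vowels, small tsu, and so on) and would call the tokenizer directly.

```python
import re

# Hypothetical mini table: only the katakana needed for this one title.
KANA = {
    "チュ": "chu",
    "ア": "a", "イ": "i", "ウ": "u",
    "カ": "ka", "キ": "ki", "シ": "shi", "タ": "ta", "ト": "to",
    "ノ": "no", "ビ": "bi", "ズ": "zu", "デ": "de", "ン": "n",
}

# (surface, part of speech, reading), hand-copied from the output above.
TOKENS = [
    ("赤", "名詞", "アカ"), ("ずきん", "副詞", "ズキン"), ("、", "補助記号", "、"),
    ("旅の途中で", "名詞", "タビノトチュウデ"), ("死体", "名詞", "シタイ"),
    ("と", "助詞", "ト"), ("出会う", "動詞", "デアウ"), (" ", "空白", "キゴウ"),
    ("wolf", "名詞", "ウォルフ"), (" ", "空白", "キゴウ"),
    ("Home", "名詞", "ホーム"), (" ", "空白", "キゴウ"),
    ("Thread", "名詞", "スレッド"), (" ", "空白", "キゴウ"),
    ("hocho", "名詞", "hocho"),
]

def romanize(reading: str) -> str:
    """Convert a katakana reading to romaji using the mini table."""
    out, i = [], 0
    while i < len(reading):
        for size in (2, 1):  # try two-character digraphs such as チュ first
            chunk = reading[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no mapping for {reading[i]!r}")
    return "".join(out)

def slugify(tokens) -> str:
    parts = []
    for surface, pos, reading in tokens:
        if pos in ("補助記号", "空白"):  # skip punctuation and whitespace
            continue
        if surface.isascii():
            # ASCII words keep their own spelling, lowercased.
            parts.append(re.sub(r"[^a-z0-9]+", "", surface.lower()))
        else:
            parts.append(romanize(reading))
    return "-".join(p for p in parts if p)

slug = slugify(TOKENS)
print(slug)
```

Note the choice to keep ASCII tokens as written instead of using their katakana readings; otherwise "Home" would go through ホーム and need long-vowel handling.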

I wonder if they changed the code or if it’s a non-default setting.

The fact is that you can put anything between the /t/ and the topic number, as that part is just an SEO thing.

https://forums.learnnatively.com/t/megumin-is-best-girl/5064

This will still work and load the proper preview:

Edit:

Seems like @seanblue went through the Discourse URL shenanigans in the past:

So it could be a mix of a setting not enabled in Discourse and/or a web server configuration limitation.
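To illustrate the point above that only the number matters: a small sketch that pulls the topic number out of a URL shaped like the ones in this thread. The regex and the `topic_id` helper are my own assumptions about the URL shape, not Discourse's actual routing code.

```python
import re
from urllib.parse import urlparse

# Assumed shape: /t/<any-slug>/<topic number>, optionally followed by more.
TOPIC_PATH = re.compile(r"^/t/(?P<slug>[^/]+)/(?P<topic_id>\d+)")

def topic_id(url: str) -> int:
    """Return the numeric topic ID, ignoring whatever the slug says."""
    match = TOPIC_PATH.match(urlparse(url).path)
    if match is None:
        raise ValueError(f"not a topic URL: {url}")
    return int(match.group("topic_id"))

a = topic_id("https://forums.learnnatively.com/t/profoundly-weird-book-club/5064")
b = topic_id("https://forums.learnnatively.com/t/megumin-is-best-girl/5064")
print(a, b, a == b)
```

Both URLs resolve to the same number, which is consistent with the slug being decorative.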

4 Likes

You sure about that? :rofl:

5 Likes

casts explosion

Shame @biblio had a terrible accident before completing the 2024 bingo.

6 Likes

@polv @Megumin So I’ve switched the URL generation to include Japanese characters.

See 赤ずきん、旅の途中で死体と出会う 🐺 Home Thread 🔪

However, the shared URL now gets very long… which honestly makes me a bit hesitant to do this :thinking:. Although it’s probably slightly better for SEO.

(ex link: %E8%B5%A4%E3%81%9A%E3%81%8D%E3%82%93%E3%80%81%E6%97%85%E3%81%AE%E9%80%94%E4%B8%AD%E3%81%A7%E6%AD%BB%E4%BD%93%E3%81%A8%E5%87%BA%E4%BC%9A%E3%81%86-home-thread/7751?u=brandon).

What do we think?

Should we show Japanese characters in forum thread urls?
  • Yes. Even though it makes sharing the url slightly harder. It’s pretty.
  • No. While it looks better, I don’t like the long copied url.
  • I don’t care!
2 Likes

Until browsers handle these things more gracefully, I think it’s better without, IMHO.

1 Like

I would actually prefer all of these texts to just be replaced by “x” :sweat_smile:
That would save me from replacing them manually each time I want to put a link somewhere :grin:

8 Likes

Ok! I’ve decided to revert to ASCII only.

@nikoru hmm, interesting. Unfortunately, since there are other languages, I think it’s slightly different for Japanese. I guess we could do another poll :thinking:

2 Likes

I usually do that with Amazon links, but I even strip the ASCII part and just leave the dp/ASIN lol.

I’m saving a lot of bytes of information!
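For anyone who wants to automate that trimming, a minimal sketch. It assumes the link contains a `/dp/` segment followed by a 10-character alphanumeric ASIN; the sample URL and its ASIN are made up for illustration, and `shorten_amazon` is a hypothetical helper name.

```python
import re
from urllib.parse import urlparse

# Assumption: the ASIN is the 10 alphanumeric characters right after /dp/.
ASIN_PATH = re.compile(r"/dp/(?P<asin>[A-Z0-9]{10})(?:[/?]|$)")

def shorten_amazon(url: str) -> str:
    """Keep only the host and the dp/ASIN part of a product link."""
    parsed = urlparse(url)
    match = ASIN_PATH.search(parsed.path)
    if match is None:
        raise ValueError(f"no /dp/ASIN found in: {url}")
    return f"{parsed.scheme}://{parsed.netloc}/dp/{match.group('asin')}"

# Made-up example link; not a real product.
long_url = "https://www.amazon.co.jp/Some-Book-Title/dp/B000000000/ref=sr_1_1?keywords=x"
short_url = shorten_amazon(long_url)
print(short_url)
```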

3 Likes

Yeah, I always strip them to the shortest format possible. Good to know that the URL bit here is just for SEO purposes.

1 Like

Would it be impossible to implement @polv’s suggestion? If I understand it right, it would change the URL so that something like ワンピース book club reads like wanpi-su-book-club.

Chart-y kind of question: is there a way to create a page-tracking chart even when you don’t time your reads? I would like to track “I read X pages on Y day” over time until a book is completed (or, I guess more pedantically, “I logged X pages, and here’s how much that moved the progress bar as a percentage”).

… reading that back I think I will probably need to scribble down a visualisation or something to make that ramble make any sense.

2 Likes

You would need to add that functionality to the Discourse source. It would probably make the URLs extremely long and, being a romanization, it would probably hurt SEO.

Also it would need to distinguish between different languages.

1 Like