Fresh Hacker News | Why does my ripped CD have messed up track names? And why is one track missing?

▲Why does my ripped CD have messed up track names? And why is one track missing?(akpain.net)

149 points by surprisetalk 1 day ago | 26 comments

I maintain my own digital music collection. The only two tools I use for maintaining the CD portion of my collection are k3b and MusicBrainz Picard. k3b can rip to flac and it will only embed metadata present on the CD itself. Then after I rip it, I add it to Picard.

I use the "lookup CD" feature in Picard, which gives me a selection of releases to choose from. Among the choices, I usually see a release matching the catalog number on my CD's case. When I don't see a matching release, I will typically add the disc ID to an existing release, or I will create a new release, or sometimes even create a new release + new release group and add the necessary metadata to MusicBrainz.

I haven't tried any automatic tagging process like the one the ripping program in the article uses, mostly because I want to use Picard to make sure the metadata is correct or contribute to MusicBrainz if it isn't.

I like MusicBrainz a lot because applications like Plex use it very well to group release groups together and will (usually) deduplicate identical recordings so that identical tracks can share a rating. It's a really great database and is kept up to date pretty well.

▲CharlesW 19 hours ago

MusicBrainz Picard is wonderful, but has one of the most unintuitive "first contact" experiences I can remember. If you're not sure how to get started, try this:

• Drag your album folders (one at a time so it doesn't get confused) into the pane that initially shows "Unclustered Files (0)" and "Clusters (0)".

• Select the "Clusters" folder in that pane and click "Lookup". This will find any close matches, and in my experience works ~25% of the time.

• For albums that weren't auto-matched, right-click the album folder name and choose "Search for similar albums…". As long as you're sorting by "Score", often you'll find a reasonably-good match in the top 5 options.

• NEVER use "Scan", basically.

For matched albums, carefully review things like album covers, titles, etc. before you "Save" the updated metadata. After using it to rebuild my personal music library, including ~200 contributions to the MusicBrainz database, I still haven't cracked (for example) how to stop Picard from replacing, by default, a perfect 1500px album cover with a less-good 1000px cover from its database.

▲sumtechguy 2 hours ago

For cover art you can control it from options->options->cover art. There are also a couple of plugins for other sources.

There are a few items in there to control if it scans external or overwrite. Recently went thru this as apparently for some reason I had totally disabled it. Think I was trying to speed up scanning as it would download every artwork for a large group into the temp folder. I usually force it to make an external file. I pick what it suggested 'cover'. Then use something like fileoptimizer to recompress the jpg/png it comes up with. I do that because I like to embed the images. And much of what is out on the net is optimized for fast editing not 'archive'. I use mp3tag to put it back into the tag.

Scan is hit or miss. I have fed it whole albums and it will somehow find 3 other albums with some of the songs from that one. That could be because of how I have options->options->metadata->Preferred Releases set. That slider bar thing for some reason I can not wrap my head around. It is good for when you come across one of those items where someone else tagged it as 'weird al' (everything is weird al if it is funny). I have been slowly getting rid of that stuff but want to find the original album to buy. Musicbrainz can be good for that sort of thing. I have also had decent luck with it if I pre-add the albums then scan. It seems to find things better.

▲lloeki 8 hours ago

> MusicBrainz Picard is wonderful, but has one of the most unintuitive "first contact" experiences I can remember.

Seconded, it's the best specialised UI I've seen in a while.

By "specialised" I mean it's entirely bespoke to a specific task and no other, with a small amount of dedicated jargon, like those industrial control panels full of buttons, toggles, and blinkenlights.

At first it's completely alien and appears to do weird stuff, possibly counterintuitive even (the mentioned "Scan" usage†, "what are clusters?", "why do I even need to cluster first?", "how do I save changes?")

But once you get the hang of it it's incredibly efficient with a ton of small niceties, like dragging a selection of entries from the left side will apply whatever candidates you have on the right side to the selection in order starting from the first.

† I use scanning only when album matching fails for whatever reason, it does sometimes unearth entries that wouldn't appear otherwise.

▲fsckboy 12 hours ago

>NEVER use "Scan", basically

never use "scan" because it will never work? or because it is somehow destructive and will mess up your "cataloging"?

▲CharlesW 11 hours ago

I have no doubt that it sometimes works, and would be happy to accept a verdict of "skill issue" if the problem is me.

Scanning a Cluster should (IMO) cause Picard to generate a series of AcoustID fingerprints/IDs from the tracks, then use that series to identify the best match (with extra points for handling missing tracks, etc.). But especially in the case of collections/compilations, the end result often resembles a transporter accident. Thankfully it's non-destructive, so it's straightforward to "Remove" all of the tracks you dragged in earlier along with the various albums that MusicBrainz created during the discombobulation process.
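
A minimal sketch of the album-level matching described above, under loud assumptions: this is not Picard's actual algorithm, and `best_release`, `candidates_per_track` and the release IDs are all hypothetical. It assumes each track has already been fingerprinted and looked up, yielding the set of releases it could belong to; the release that the most tracks agree on wins, and tracks with no match simply lower the score.

```python
from collections import Counter


def best_release(candidates_per_track):
    """Pick the release that the largest number of tracks could belong to.

    candidates_per_track: one set of candidate release IDs per ripped track
    (hypothetical input; an empty set means the lookup found nothing).
    Returns (release_id, fraction_of_tracks_matched), or (None, 0.0).
    """
    votes = Counter()
    for candidates in candidates_per_track:
        votes.update(candidates)  # each track votes once per candidate release
    if not votes:
        return None, 0.0
    release_id, count = votes.most_common(1)[0]
    return release_id, count / len(candidates_per_track)


# Made-up lookup results for a four-track rip.
tracks = [
    {"album-A", "compilation-X"},
    {"album-A"},
    {"album-A", "compilation-Y"},
    set(),  # a track whose fingerprint matched nothing
]
print(best_release(tracks))  # ('album-A', 0.75)
```

Voting across the whole cluster, rather than matching track by track, is what would keep a single album from being scattered over several compilations.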

To be clear, my overall opinion of MusicBrainz and MusicBrainz Picard is that they are unappreciated triumphs. It would be nice if Wikipedia and Internet Archive diverted 0.01% of their fundraising to them. Google is the primary hero in their story, supporting them with over $500K so far. https://metabrainz.org/sponsors

▲prmoustache 2 hours ago

What does "maintain" mean in that context? Once you have ripped your CD and stored it somewhere there is nothing to maintain afaik (well, apart from having backups if you don't want to ever rip them again).

▲mayneack 21 hours ago

Yeah, imo using musicbrainz/picard is great for the process of bringing something into your collection. I encounter errors like others here have mentioned, but they're straightforward to fix. Importantly, it sets up a reference to an evolving update process so changes down the line can get back to my files cleanly.

▲commotionfever 20 hours ago

Since you mention Picard and wanting to contribute to MusicBrainz: I'm working on a new fast tagger[1] in the spirit of Picard or beets. Just a little different and more scriptable.

It makes its best attempt to match with MusicBrainz, but if there's no match it offers links to pre-seed MusicBrainz with tools like Harmony.

[1] https://github.com/sentriz/wrtag

▲CharlesW 19 hours ago

Harmony (https://harmony.pulsewidth.org.uk/) is amazing, and completely changed my relationship with MusicBrainz.

What are you using for tag reading/writing in Go? Robust, complete options are non-existent in JavaScript land (Deno, Bun, Node, etc.), so I ended up creating a Wasm version of TagLib with a TypeScript API.

▲commotionfever 18 hours ago

haha that's funny! I made a WASM TagLib for Go

https://github.com/sentriz/go-taglib

▲CharlesW 17 hours ago

Cooool, I love that you arrived at the same conclusion! Mine's not ready for its ShowHN, but as an enthusiast, I'm super-excited to dig into yours. Very nice work!

▲riedel 12 hours ago

I also recommend AudioRanger for re-sorting and moving stuff into the right places. For ripping I use ExactAudioCopy, which also supports flac.

▲mikepavone 15 hours ago

This is a small point, but calling the 33-byte unit a sector in CDDA is a bit misleading and probably incorrect for the quantity being labeled. This is a channel data frame and contains 24 bytes of audio data, 1 byte of subcode data (except for the channel data frames that have sync symbols instead) and the rest is error correction. This is the smallest grouping of data in CDDA, but it's not really an individually addressable unit.

98 of these channel data frames make up a timecode frame which represents 1/75th of a second of audio and has 2352 audio data bytes, 96 subcode bytes (2 frames have sync codes instead) with the remainder being sync and error correction. Timecode frames are addressable (via the timecodes embedded in the subcode data) and are the unit referred to in the TOC. This is probably what's being called a sector here. Notably, a CD-ROM sector corresponds 1:1 with a timecode frame.

Note: Red book actually just confusingly calls both of these things frames and does not use the terms "channel data frame" or "timecode frame"
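
The numbers above can be checked with a few lines of arithmetic. This is only a sanity check of the figures quoted in this comment; the constant names are made up for illustration, and the 44.1 kHz / 16-bit / stereo parameters are the standard CD audio format rather than something stated above.

```python
# Figures quoted in the comment above.
AUDIO_BYTES_PER_CHANNEL_FRAME = 24
CHANNEL_FRAMES_PER_TIMECODE_FRAME = 98
SYNC_FRAMES_PER_TIMECODE_FRAME = 2  # these carry sync symbols instead of subcode
TIMECODE_FRAMES_PER_SECOND = 75

# Standard CD audio parameters (assumed, not from the comment).
SAMPLE_RATE = 44_100
BYTES_PER_SAMPLE = 2
CHANNELS = 2

audio_bytes_per_timecode_frame = (
    AUDIO_BYTES_PER_CHANNEL_FRAME * CHANNEL_FRAMES_PER_TIMECODE_FRAME
)
subcode_bytes_per_timecode_frame = (
    CHANNEL_FRAMES_PER_TIMECODE_FRAME - SYNC_FRAMES_PER_TIMECODE_FRAME
)
audio_bytes_per_second = audio_bytes_per_timecode_frame * TIMECODE_FRAMES_PER_SECOND

print(audio_bytes_per_timecode_frame)    # 2352
print(subcode_bytes_per_timecode_frame)  # 96
print(audio_bytes_per_second)            # 176400
print(audio_bytes_per_second == SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)  # True
```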

▲Lammy 21 hours ago

I used to do the MusicBrainz thing with Picard and later with Beets, but I got sick of Somebody Else's Metadata because of MusicBrainz's (former?) policy where everything must be Title Cased regardless of how it's presented on the CD sleeve. I prefer my tags to match the artist's choice, because I consider it a tonal indicator that helps set the mood for the work.

It seems like they might not enforce that any more since the album I was going to pick on as an example is now tagged like I have it, although I also have lower-case “my bloody valentine” Artist tags on every track with Title Cased “My Bloody Valentine” Album-Artist tag for browsing in Navidrome: https://musicbrainz.org/release/1e4c282b-8b0d-4d20-9f74-175f...

…but I already got out of the habit and will still just keep typing them out myself :)

I also always include the catalog number in the Comment field and in brackets in my folder names to separate different releases of supposedly the same thing. Good example of why you would want to do this is the 2004 vs the 2007 releases of MM..FOOD? where the last track (Kookies) had to be redone to remove the Sesame Street samples:

- 2004: https://www.youtube.com/watch?v=Ci_XcL4nYos

- 2007: https://www.youtube.com/watch?v=8iYSwvdEfeY

Shout-out to https://covers.musichoarders.xyz/ and https://fanart.tv/ for high-quality album art to embed.

▲qingcharles 16 hours ago

As someone responsible for setting the track naming policy originally at a big streaming company, I can't remember what the policy was. I know I would be called in all the time for crazy shit like Aphex Twin having just a page of equations as track names, or I seem to remember some album by Röyksopp that had just colors printed for the track names and no words. That stuff killed me.

Or the team doing all the ingestion being overworked minimum wage high school grads and suddenly an entire semi truck turns up and it's just pallets of CDs entirely in various East Asian languages.

If I had to do it over I would have two fields, one for whatever best represented what the CD says (and as someone below me points out, this was usually the publisher's artistic discretion and differed between the data they sent, the back of the CD, the track list printed on the CD and the liner notes) and I would have a separate field for Title Cased Titles.

▲tom_ 13 hours ago

Aphex Twin's Selected Ambient Works 2 had a track listing that was 6 pie charts, one image per slice, 1 slice per track. I assume it came on 3 LPs, but I had the 2 CD version, and there were corresponding pie charts printed on the face up sides of the CDs... as if it made it any clearer.

I ripped them about 15 years ago and cddb came up with track names for them, matching the ones in its Wikipedia page (https://en.wikipedia.org/wiki/Selected_Ambient_Works_Volume_...). I wonder if we have any evidence that the mapping from tracks to images is remotely correct.

▲qingcharles 9 hours ago

I came up with the first CD DB for Windows 95 just before it came out. I realized Media Player had an .ini file with the track names and so I had people on Usenet send me all their listings and would reintegrate it and publish it frequently for the first few weeks until I realized .ini files had a 64KB limit and that was the end of that.

If I did it again, which I have planned for a long time, I would require citations for every track listing. Sure, it's a big barrier, but it'd be nice to get it right where possible. The primary citations would generally be to the album cover, but in cases like the Aphex Twin insanity, cites to things like interviews and label demo releases etc could definitely be valid.

▲JohnFen 20 hours ago

> MusicBrainz's (former?) policy where everything must be Title Cased regardless of how it's presented on the CD sleeve.

Is that why that happens? It was always a baffling thing to me and required manual correction (and is one of the sorts of errors that made MusicBrainz less useful).

▲pavon 19 hours ago

Part of the difficulty is that artists/labels aren't always consistent about the formatting of song titles. It's not uncommon for the capitalization to vary between the back cover of the CD, the printing on the CD itself and the liner notes. And then you have variations between releases of the same CD, and digital releases where the file metadata, and the store listing, and the artist website also all vary. So I can't blame MusicBrainz for choosing to normalize by default. Ideally, you could use normalized case for the Recording and Work song titles, and then stylized for the Release song titles, but most people don't go to that level of detail when entering songs.

▲JohnFen 17 hours ago

Oh, I understand the problem, and I don't blame them either. However, it is a part of why these services stopped being useful to me.

▲amiga386 21 hours ago

> policy where everything must be Title Cased regardless of how it's presented on the CD sleeve

If the music artist decided how it should be on the CD sleeve, and you can show that, then you can go with that. But more often than not, the sleeve is done by the record company's graphic designers, not the music artist.

https://musicbrainz.org/doc/Style/Titles

> Album and song titles are often found in upper‐case on the back cover of CDs. For example, the album Songs of Love and Hate is written as “SONGS OF LOVE AND HATE” on the cover. This is usually the choice of a graphic designer, not the artist. So, instead of copying the title from the cover, we follow certain rules to capitalize a title.

https://musicbrainz.org/doc/Style/Principle/Error_correction...

> Error Correction: There are many cases of record companies incorrectly reproducing titles or even artist names, or breaking generally accepted rules of usage for stylistic purposes. In such cases it often makes sense to fix errors and standardize irregularities, valuing correct spelling, punctuation and grammar over faithfulness to the printed release cover.

> Artist Intent: Artists sometimes choose to present names and titles in ways that deliberately contradict the rules of the language they're in (e.g. unorthodox spellings) and/or the MusicBrainz Style Guidelines. To describe the way we handle such choices, we use the term "artist intent." The general idea is that if an artist intended something to be written in a special way, then MusicBrainz should follow that intent. Unfortunately, it can be difficult to find out what an artist intended. If you want to claim that some deviation from the Style Guidelines should be considered artist intent, the burden of proof lies on you.

▲ItsHarper 16 hours ago

Seems reasonable. I'd think this should be pretty straightforward for songs new enough to be released online. If it's capitalized a certain way on Spotify, that's almost certainly what the artist intended.

▲Avamander 21 hours ago

I can't recall when something like that was enforced. Artistic intent is definitely something that editors and guidelines intend to preserve. Though in some cases it might be hard to determine if something is a mistake or intentional - there are incredibly weird releases.

▲ubermonkey 6 minutes ago

You know, I've only ever ripped with iTunes or Music on a Mac, and I've never run into this over (at this point) decades and thousands of rips. Am I just lucky?

▲pflenker 22 hours ago

Somewhat related: some conscious artistic choices - such as writing down two tracks but delivering them as one (not sure if this is what happened here) can’t really be transferred into databases.

I own a cd where one track name is a small icon depicting a heart stabbed with a rather lengthy knife. To my knowledge, this track has no canonical name. Any digital version of this cd betrays the respective author‘s interpretation of the icon.

And then, of course, there’s „Love Symbol“: https://en.wikipedia.org/wiki/Prince_(musician)

▲FearNotDaniel 4 hours ago

> can’t really be transferred into databases

Of course they can, it's up to the person designing the database schema to anticipate what is a common artistic practice and model the data accordingly. It might be that specific databases like MusicBrainz and Gracenote haven't accounted for that, but if you own the schema you can easily set up a one-to-many (parent/child) relationship between physical track and song name.

One extreme example of this would be the "Lovesexy" album by (the artist formerly known as) Prince, which in its original CD form had only one track, containing 9 songs. I think the Spotify version is still faithful to this.
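
A minimal sketch of that parent/child relationship, assuming nothing about how MusicBrainz or Gracenote actually store data; `PhysicalTrack` and `Song` are hypothetical names, and the song titles are placeholders rather than the real track list.

```python
from dataclasses import dataclass, field


@dataclass
class Song:
    title: str


@dataclass
class PhysicalTrack:
    number: int  # track number as indexed on the disc
    songs: list[Song] = field(default_factory=list)  # one track, many songs


# A disc with a single physical track that contains nine songs.
disc = [PhysicalTrack(number=1, songs=[Song(f"Song {i}") for i in range(1, 10)])]

print(len(disc), len(disc[0].songs))  # 1 9
```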

This and many other common "conscious artistic choices" ought to be collected into a "Falsehoods Programmers Believe About Recorded Music", if that is not already a thing...

In your example above, yes it's true that many song titles and artist names are fully and partially graphic symbols with no direct text representation (another thing Prince was fond of), but again given the prevalence of this there's no reason a smart data schema couldn't model a song or artist having a 'canonical' name that can only be represented by some graphic format along with one or more pronounceable/text-encodable alternatives (TAFKAP/Love Symbol) and so on; and of course tracking the fact that the 'preferred' identifier can change over time (Cat Stevens/Yusuf Islam, to mention a non-Prince example).
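
As a rough sketch of such a schema (all class and field names here are hypothetical, and the artist and dates are invented), a name can carry an optional graphic form, any number of text alternatives, and the date it became the preferred identifier:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Name:
    text: Optional[str] = None          # text-encodable form, if one exists
    image_path: Optional[str] = None    # graphic form, for symbol-only names
    aliases: list[str] = field(default_factory=list)  # pronounceable alternatives
    valid_from: Optional[date] = None   # None means "from the beginning"


@dataclass
class Artist:
    names: list[Name] = field(default_factory=list)

    def preferred_name(self, on: date) -> Optional[Name]:
        """Return the most recently adopted name in effect on the given date."""
        in_effect = [
            n for n in self.names if n.valid_from is None or n.valid_from <= on
        ]
        return max(in_effect, key=lambda n: n.valid_from or date.min, default=None)


# Invented example: an artist who switches to a symbol-only name in 2000.
artist = Artist(
    names=[
        Name(text="Old Name"),
        Name(image_path="symbol.svg", aliases=["New Name"], valid_from=date(2000, 1, 1)),
    ]
)
print(artist.preferred_name(date(1999, 6, 1)).text)        # Old Name
print(artist.preferred_name(date(2001, 1, 1)).image_path)  # symbol.svg
```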

▲pflenker 6 minutes ago

It’s entirely possible to come up with artistic choices which precisely aim to be impossible to capture. So at times it’s not due to falsehoods someone believes, but rather the opposite - artists deliberately breaking the limits of what’s currently possible. Not a direct CD example, but one vinyl (was it by Pink Floyd?) has a special last track, where the needle is redirected endlessly, making the last track effectively endless. Or double grooved vinyls, Track 0s on CDs and so on.

But to stay with my example of the stabbed heart - even if the DB supported it, you’d still have to make choices when converting the icon as printed into the database, such as coloring.

▲indrora 20 hours ago

How about "Naming the CDs"

There's a handful of albums that MusicBrainz doesn't quite have the right cd naming for since one was labeled "LEFT" and the other "RIGHT" and not 1/2 -- there is no canonical 1/2 order.

▲Sniffnoy 21 hours ago

What's the CD?

▲pflenker 14 minutes ago

The Inchtabokatables - Ultra

▲sandreas 12 hours ago

Maybe for archival purposes you could use `redumper` (https://github.com/superg/redumper) to prevent ripping mistakes.

My personal workflow:

  - rip the audio CD via EAC with acousticID (flac)
  - retrieve metadata via beets in a script completely automated
  - convert flac to mp3 via beets inplace convert (see below)
  - backup the flac files to another location
  - self-host navidrome and use the substreamer / dsub app and smart playlists to listen "on the go" (The Apple usb-c-to-audiojack adapter is pretty decent)
  - transfer this via iTunes VM to my good old iPod Nano 7g as main listening device for audiobooks

If anyone is looking for fast and accurate ripping hardware, recently I updated my recommended hardware list including a linked tutorial for EAC:

https://pilabor.com/blog/2022/10/audio-cd-ripping-hardware/

beets convert config:

  convert:
    auto: no
    ffmpeg: /usr/bin/ffmpeg
    opts: -codec:a libmp3lame -qscale:a 0 -ac 2 -ar 48000 -map_metadata 0 -movflags use_metadata_tags
    max_bitrate: 192
    threads: 1

▲maeln 8 hours ago

~Why use MP3 instead of opus, vorbis or AAC ? All of them have (most of the time) better compression ratio (and better quality) than MP3. Is it for compatibility reason ?~

edit: Ah, I missed the ipod nano part

▲sandreas 3 hours ago

Just compatibility and "high enough" quality. Works in my car, on my iPod, on my Phone, on my kitchen radio and is the most common format in general.

▲nani8ot 7 hours ago

iPod Nano 7th gen. does support AAC (AIFF & WAV too).

▲kevin_thibedeau 22 hours ago

There's always going to be outliers but I find MusicBrainz pretty useful. I note that a lot of CD-text has poor application of title capitalization and MB usually has it in a more rational form. My ripping system presents a choice when both are available and I usually pick MB. There's also the benefit that the MB database is Unicode and CD-text is whatever the authoring tool used which is usually CP1252 but sometimes not.
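
To illustrate the encoding problem, here is one possible heuristic for turning CD-Text bytes of unknown encoding into Unicode. It is a sketch only: `decode_cd_text` is a hypothetical helper, the "try UTF-8, then CP1252" order is an assumption based on the comment above, and discs authored in other encodings would still come out wrong.

```python
def decode_cd_text(raw: bytes) -> str:
    """Decode CD-Text bytes of unknown encoding into a Unicode string."""
    try:
        # Strict UTF-8 first: non-ASCII text in a legacy single-byte
        # encoding is rarely valid UTF-8 by accident.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fall back to CP1252; bytes it does not define become U+FFFD.
        return raw.decode("cp1252", errors="replace")


print(decode_cd_text("Röyksopp".encode("cp1252")))  # Röyksopp
print(decode_cd_text("Röyksopp".encode("utf-8")))   # Röyksopp
```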

▲Asmod4n 23 hours ago

CD Text is a thing. Sadly, no major label uses it anymore to embed metadata into their records; if they did, something like MusicBrainz wouldn't be needed.

Sony was a big supporter of it ~25 years ago.

▲trentnix 19 hours ago

For the younger crowd: fancy head units (that's what we called the essential aftermarket CD player/receiver in the dash of your vehicle) would show you CD Text with artist, album, and track name. It would melt the brains of your friends when the name of the song that was playing would scroll by on an old-school, single- or multi-line LCD display. It was a massive flex in its day.

Good times...

▲badc0ffee 18 hours ago

My 2006 Toyota had that. What I really wanted was an aux port, or even a cassette deck I could use with an adapter to plug in my iPod. Instead I had to make do with a FM transmitter plugged into the cigarette lighter.

▲rhinoceraptor 17 hours ago

My 2017 Focus ST still has a CD player with CD text, and I actually do listen to music on CD in it, the bluetooth quality is noticeably worse for whatever reason. I got my first iPod in about 2007 in middle school, and I only ever had about 10-20 CDs growing up, but I started getting into CDs about a year ago. It seems like there is a minor resurgence now that vinyl is expensive, since CDs still cost the same as they ever did, and a lot of them are cheaper even without inflation. I picked up a copy of Pretty Hate Machine at a Walmart for $8 the other day.

▲asciimov 16 hours ago

Of course Sony was, because they own the patent for it.

The reason other labels, and most cd units, don’t use CD-Text is companies don’t want to pay for the license.

▲Henchman21 21 hours ago

When I was building out infrastructure to support streaming at Sony Music Entertainment, it was well known that interns would input the metadata. Typos were rife and genres? Made up out of whole cloth.

It feels safe to assume that the situation has improved since then, but I doubt seriously we’ll ever be free of typos ;)

▲JohnFen 17 hours ago

> genres? Made up out of whole cloth.

The problem with genre remains entirely unsolved across the board. The solution I use in my collection is to do what everyone else seems to do: make them up out of whole cloth. Because I'm the only one making them up, it means my labeling is at least internally consistent.

▲TylerE 8 hours ago

The biggest issue with genres is most databases treating them as one to one rather than one to many.

▲JohnFen 1 hour ago

That's a real issue. I think the biggest issue with genre, though, is that even if people agree on a list of possible genre labels, there is often disagreement about what music belongs in which genre.

This isn't a new problem at all. Even music labels often disagree. Back when record stores were a thing, it was pretty common for different stores to categorize the same albums differently in terms of genre. I think the only way to avoid it is to stick to very, very broad categories. "Rock", for instance, covers an amazingly broad set of styles.

▲Henchman21 15 hours ago

I will admit that I do precisely the same with my collection! But I truly felt that those interns should’ve received a list to choose from, not an open text field.

▲lloydatkinson 21 hours ago

It's sad Sony put the effort into writing rootkits for music CDs but did nothing to automate, flag, fix typos for metadata...

▲mxuribe 18 hours ago

I remember the Sony rootkits...Since then and to this day, i avoid buying anything related to Sony as best i can. Funny thing is, folks who know me know that i am not the kind of person who holds a grudge....but something about that rootkit event really brought the ire in me....one of the extremely few times where i held a grudge. So, i avoid Sony and go on with my life.

I also stop buying at other companies...but for other companies for some reason i don't hold onto the ire...i just stop buying from them, and quietly move on...but Sony....i don't get it, but the dislike is crazy.

▲Henchman21 17 hours ago

I recall a meeting where my team was asked to do some technical legwork for the implementation. To his credit, my boss stood up, said some words about ethics, and led our team out the door. It wasn’t the entire org… just the music business folks as I recall. I left shortly thereafter.

▲qingcharles 16 hours ago

I was doing digital ingestion for Sony in Europe and they sent us all those hobbled "unrippable" CDs and asked us to rip them for streaming. They were kinda embarrassed about it.

▲Henchman21 15 hours ago

Man, I bet you guys went through a couple boxes of sharpies getting those ready to rip! :)

▲mxuribe 16 hours ago

I highly commend you, your boss, and any others who stood up or otherwise rebelled against the despicable Sony leaders who wanted this to be done. I can only imagine that it would not have been easy. My appreciation goes out to you, your boss, and the rest of the team...and i only wish there were more folks like you in the world! For that, thank you sincerely!

▲Henchman21 54 minutes ago

All that praise goes to my old boss; the rest of us were a bit shocked we’d been asked such a thing! But he was already saying no before I’d even processed the request. He succeeded in delaying that “project” several years — I left in ‘04. IIRC this became public late ‘05-early ‘06?

▲Henchman21 19 hours ago

Agreed. I could say tons here, but it’ll suffice to say that I am wildly happy I no longer work there!

▲piperswe 23 hours ago

> Edits on MusicBrainz spend 7 days in limbo after they're created

Not all edits, just major ones (e.g. name changes). Minor edits usually get auto-accepted.

▲Avamander 21 hours ago

Faster if someone votes on the edit, which you can request on their IRC/Discord/Discourse if there's a need (like larger or dependent edits).

▲amiga386 22 hours ago

And just so people know, their edits were applied in March this year...

  Edit #122458416 - Edit medium
  Vote tally: 0 yes : 0 no
  Status: Applied
  Opened: 2025-02-24 00:02 UTC
  Closed: 2025-03-03 01:00 UTC
  For quicker closing: 3 unanimous votes
  If no votes cast: Accept upon closing

▲infl8ed 10 hours ago

Actually, and quite interestingly, it looks like their second edit (to separate the tracks) failed: https://musicbrainz.org/edit/122458694

> Status: Failed dependency. This edit failed either because an entity it was modifying no longer exists, or the entity can not be modified in this manner anymore.

Clicking through to the CD release we can see that it indeed still has those two tracks combined https://musicbrainz.org/release/af4dc096-65d2-4cc5-9e0c-176d...

▲egypturnash 23 hours ago

Damn, MusicBrainz is still running?

"MusicBrainz is operated by the MetaBrainz Foundation, a California based 501(c)(3) tax-exempt non-profit corporation dedicated to keeping MusicBrainz free and open source." - the gloriously retro-looking front page

▲piperswe 23 hours ago

Still running and still doing great! Some of us still curate a local music library instead of streaming ;)

▲egypturnash 19 hours ago

I curate my own library too but it's pretty much all off of Bandcamp. I don't even own a CD drive I could rip with any more.

▲pavon 19 hours ago

Even with digital releases, MusicBrainz often has more detailed metadata than the original files. And if you have a mixed library of rips and digital purchases, it is nice to use a tagger like Picard to enforce consistent directory structure and filenaming.

▲masklinn 22 hours ago

You can curate a music library without ripping CDs tho.

▲JohnFen 22 hours ago

Depends on your musical tastes. A good 25% of the music in my library is not available in any form other than used CDs.

▲dwedge 21 hours ago

What kind of music?

▲mtillman 21 hours ago

Not sure about OP but I have all manner of blues and jazz recordings unavailable via streaming. There are also lots of obscure Japanese game and rock recordings that aren't in Apple or Spotify, though to Spotify's credit, they have a lot of game content. Streaming is mostly in service of licenses and margins, which, as a shareholder, makes sense to me.

▲seba_dos1 2 hours ago

Even local super popular rock bands from 80s don't always have their entire catalog available on streaming services, and solo endeavors of their musicians are often nowhere to be seen there.

▲shermantanktop 13 hours ago

People seem to assume that any decent creative output always gets carried forward to the next form of media tech. But there are 78s that didn’t make it to LP, much less anything after that.

▲mtillman 11 hours ago

Another example: Nick and Nora pre code films weren’t on Netflix the last time I looked.

▲JohnFen 21 hours ago

A wide range, actually. It's more about the time period and artists than musical style. If it's earlier than the 90s and/or from an artist who wasn't big on the charts, it gets more likely that they're not available except on used CD.

In that sense, the depth and variety of good music that is available has been shrinking for a long while now. The advent of streaming seems to have made it worse.

▲PaulDavisThe1st 16 hours ago

By contrast, before I got rid of almost all my vinyl, one particular sub-collection that I had was about 200 12" singles from the London club scene in 1981-1985. Almost none of the tracks ever appeared on CD or were ever released digitally.

All of them were available on youtube, even the whitelabel DJ-only releases!

▲ssl-3 6 hours ago

I can curate my own library of bookmarks within [some other body's music library] without CDs; of course I can.

I can do that with iTunes or Spotify or Tidal or Amazon Music or whatever else.

But none of these bookmarks are necessarily related to my music. They are only just bookmarks that refer to music that might exist within the libraries that these bodies provide.

And while all of these libraries are certainly quite vast, there's a fuckton of (published!) music that these commercial libraries do not provide.

▲ZeroGravitas 22 hours ago

MusicBrainz has (or at least had) an acoustic fingerprint system for processing audio files too.

▲Avamander 21 hours ago

This is the part that tends to have the most mistakes, if used. It's generally better to provide minimal info manually if the CD wasn't identified by its ID.

▲piperswe 20 hours ago

Indeed! About half of my new music acquisition is on CD, the other half is Bandcamp/Qobuz/7Digital.

▲OkayPhysicist 22 hours ago

Seeing a Mastodon link on a clearly hand-written HTML site is neat.

▲cloud8421 21 hours ago

I use MusicBrainz and donate every month - yeah data is not perfect, but you can go and fix it yourself if needed, and the UI is extremely functional without any frills.

▲vkaku 7 hours ago

Perhaps I should create an overlay for MusicBrainz with sub-minute lag called ZombieBrainz.

If you own a CD and send an edit with a $5 donation, it goes on volatile and nightly; it can go to beta instantly for $100 donations and if not it'll have to be flagged for violations. If it needs to happen instantly on stable, $10000 (generous patron tier, where I will write a blog post for this entry as well) else get to it in 3 months.

▲alexchantavy 14 hours ago

Wait so back in the day I remember Winamp let you configure a CDDB thing and it connected to something called.. Gracenote? (Am I remembering that correctly?) iTunes desktop at some point used to handle this all for you and I assumed it was pulling from those sources under the hood. Where did MusicBrainz come from?

▲CharlesW 13 hours ago

https://en.wikipedia.org/wiki/MusicBrainz

“MusicBrainz was founded in response to the restrictions placed on the Compact Disc Database (CDDB), a database for software applications to look up audio CD information on the Internet. MusicBrainz has expanded its goals to reach beyond a CD metadata (information about the performers, artists, songwriters, etc.) storehouse to become a structured online database for music.”

More detail here: https://courses.cs.umbc.edu/771/papers/ieeeIntelligentSystem...

▲shermantanktop 13 hours ago

Gracenote is alive and well, and mostly supporting the video entertainment industry I think but with forays into adtech and other such schemes.

▲Sniffnoy 21 hours ago

Man, I thought this was going to be about a decoding tool that had some edge case incorrect, but instead it was just about incorrect entries in a database that was used in place of actually decoding...

▲0points 12 hours ago

Well written. I am dealing with some similarities setting up a jellyfin server and the episode data of some series are rather incomplete.

So I've been contributing to tmdb for the last half year or so :)

▲rconti 20 hours ago

> Aside from some audio tracks and a table of contents over those tracks, very little extra information is included on a disk - you've pretty much only got the artist name, album name and track names actually burned into the disk.

Huh, I actually didn't think there was any metadata at all.

▲KwanEsq 17 hours ago

Yeah audio CDs do (or at least can) carry those bare bones of metadata, which can be used by some CD players with built-in displays to display the currently playing track title etc.

It's defined by the CD-Text extension[0] to the Red Book standard.

I think classical releases probably make greater use of it to encode things like composer and arranger, since they are more important to that audience, but for the average popular music release you're only going to get the artist and title, and maybe the ISRC that few are going to care about/display anyway.

[0] https://en.wikipedia.org/wiki/CD-Text

▲dhosek 20 hours ago

Yeah he goes on to talk about an external data source for metadata, so this statement is, as far as I know, wrong, even by the standard of what’s in this article.

▲rconti 20 hours ago

no, there's a part about it later, assuming we can take their word for it: (ugh, HN formatting is the _worst_)

------

Taking a look at the metadata embedded into the disk itself, we can see that track 6 is actually titled "Don't Need a Reason" on there:

FILE "./06. Finish Ticket - Nothing Coming Soon.flac" WAVE
  TRACK 06 AUDIO
    TITLE "Don't Need A Reason"
    ISRC USDPK2300133
    INDEX 01 00:00:00
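
For anyone who wants to check this on their own rips, here is a minimal sketch of pulling the per-track fields out of a cue sheet like the excerpt above. The `parse_cue` helper is hypothetical (not part of any ripping tool), and it only handles the few keywords shown here (FILE, TRACK, TITLE, PERFORMER, ISRC, INDEX), not the full cue sheet syntax.

```python
import re

# The excerpt quoted above, as a contiguous cue sheet fragment.
CUE_EXCERPT = '''\
FILE "./06. Finish Ticket - Nothing Coming Soon.flac" WAVE
  TRACK 06 AUDIO
    TITLE "Don't Need A Reason"
    ISRC USDPK2300133
    INDEX 01 00:00:00
'''


def parse_cue(text):
    """Return one dict per TRACK, holding the fields that follow it.

    Hypothetical helper: covers only the keywords in the excerpt.
    """
    tracks = []
    current_file = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        if keyword == "FILE":
            # FILE "<name>" <type>: keep only the quoted file name.
            match = re.match(r'"(.*)"\s+\S+$', rest)
            current_file = match.group(1) if match else rest
        elif keyword == "TRACK":
            tracks.append({"number": int(rest.split()[0]), "file": current_file})
        elif tracks and keyword in ("TITLE", "PERFORMER", "ISRC"):
            tracks[-1][keyword.lower()] = rest.strip('"')
        elif tracks and keyword == "INDEX":
            index_number, timestamp = rest.split()
            tracks[-1].setdefault("index", {})[int(index_number)] = timestamp
    return tracks


tracks = parse_cue(CUE_EXCERPT)
for track in tracks:
    # Shows the file name (from the database lookup) next to the on-disc title.
    print(track["number"], "|", track["file"], "|", track["title"])
```

Comparing `file` against `title` for each track is a quick way to spot the kind of mismatch described in the article.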

▲dhosek 20 hours ago

I had always thought that duplicate disc IDs based on the TOC were unlikely, but it turns out that with discs with fewer tracks (≤4 or so), you can get duplicates quite easily.
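
A rough sketch of why that happens, assuming the MusicBrainz disc ID scheme as I understand it from their documentation: the ID is a SHA-1 hash over the first and last track numbers plus the lead-out and track start offsets, and nothing else. The audio itself never enters the calculation, so two different discs with the same TOC get the same ID. The function name and the offsets below are made up for illustration.

```python
import base64
import hashlib


def musicbrainz_disc_id(first_track, last_track, leadout_offset, track_offsets):
    """Sketch of the disc ID calculation, based on my reading of the docs.

    Offsets are in CD frames (75 per second), including the 150-frame lead-in.
    """
    # Slot 0 holds the lead-out offset, slots 1..99 hold track start
    # offsets, and unused slots stay at zero.
    offsets = [0] * 100
    offsets[0] = leadout_offset
    for number, offset in enumerate(track_offsets, start=first_track):
        offsets[number] = offset

    # Everything is hashed as upper-case hex ASCII text.
    payload = "%02X%02X" % (first_track, last_track)
    payload += "".join("%08X" % offset for offset in offsets)
    digest = hashlib.sha1(payload.encode("ascii")).digest()

    # Base64 with URL-unfriendly characters swapped out.
    encoded = base64.b64encode(digest).decode("ascii")
    return encoded.replace("+", ".").replace("/", "_").replace("=", "-")


# Hypothetical two-track single: only these numbers feed the ID.
disc_a = musicbrainz_disc_id(1, 2, 40000, [150, 20000])
# A different disc that happens to have the same layout.
disc_b = musicbrainz_disc_id(1, 2, 40000, [150, 20000])
# One frame of difference in the lead-out gives an unrelated ID.
disc_c = musicbrainz_disc_id(1, 2, 40001, [150, 20000])

print(disc_a == disc_b, disc_a == disc_c)
```

With only a handful of tracks there are very few offsets to differ on, which fits the observation that short discs collide more often.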

▲devmor 20 hours ago

I wonder if this explains why some EPs I have received as ZIPs from friends get tagged incorrectly in programs like Jellyfin.

▲KwanEsq 21 hours ago

Huh, once I saw the image with the discrepancies I immediately assumed 'ah, "Nothing Coming Soon" must be in the pre-gap of "Don't Need a Reason", especially with that track length, and the rip combined that into one music file', but no, turns out it just isn't defined in the disc metadata at all. Wonder if that's a (mastering?) error, given that the TITLE metadata doesn't even include it.

▲b0a04gl 7 hours ago

that part in the blog where OP mentions CDs don’t store song names at all hit me hard. all these years i thought my old sony player was just lazy, turns out the format never even tried. whole childhood spent memorizing track numbers just to play one damn song.

▲dogman1050 15 hours ago

I've ripped hundreds of CDs and the metadata is usually ok on commercial discs. When ripping CDs I created from LP rips, I use Mp3tag to make it right.

▲JohnFen 23 hours ago

MusicBrainz and CDDB have become error-ridden enough that I've essentially stopped bothering with them and have switched back to just entering the information manually.

▲dawnerd 23 hours ago

It's worse if you're ripping foreign audio. I got a bunch of discs from Japan, for which I would have assumed, being Japan and all, there would be excellent data online. Wrong. Every single album got matched to something else.

Even accurip was incorrect. I pretty much don't trust any of the online data sources anymore and just manually enter meta.

And don't do what I did... don't just let beets run unattended. What a pain that was.

▲qingcharles 16 hours ago

I was doing audio + metadata ingestion for the major labels and they sent us a truck load of East Asian CDs of different languages, and here's me with a team of poor minimum wage high school grads looking at me all crazy.

▲JohnFen 23 hours ago

Yes, you're right. Also, with obscure or rare CDs. If they're in the databases at all, the odds are better than 50% that the data is incorrect to some degree, or they are confused with completely different albums.

▲jeffbee 23 hours ago

Isn't that still a labor-saving starting point?

▲jandrese 22 hours ago

Depends how long it takes you to figure out what the problems are and fix them.

Debugging is usually harder than coding, and the amount of data we are talking about is fairly small. Just typing it in could easily be faster.

▲setr 23 hours ago

the problem with false positives is that a single instance means you have to review every record meticulously, because you have no idea where the system has lied to you, or how many times (because the system itself doesn't). If you're going to review everything anyways, it's often better to simply be slow and correct to begin with rather than diff and correct every item.

this is why it's usually better to be overaggressive with saying "I don't know" rather than crossing your fingers and shitting out an answer and hoping you get away with it.

▲dylan604 21 hours ago

When did we switch the conversation to LLM issues? =)

One of the devs at a company I used to work for shocked me when he said "bad data is better than no data" after I asked why the input field was limited to a drop-down of pre-filled values that were irrelevant, with no way of filling in correct data. At that point, I just felt the entire database was suspect.

▲pavon 19 hours ago

It depends. I'd like to argue that you have to enter the information one way or another, so why not share it and save others the work in the future, but in reality it is often quite a bit slower. MusicBrainz likes to collect more information than a normal CD ripper would ask for, with more pages to click through, so that is a bit slower. However, the main annoyance is when you have to make a correction that isn't auto-approved, and then you have to wait 7 days before your tagger/ripper software will see the changes you made. I wish there was a better workflow to tell Picard to use a pending edit[1].

I still always use MusicBrainz, and enjoy contributing to it, but more like others enjoy contributing to Wikipedia, rather than as an efficiency boost.

[1]https://tickets.metabrainz.org/browse/PICARD-1278

▲JohnFen 23 hours ago

Often not, because it's less effort to type the information in fresh than to review and edit the existing information.

I'm not saying the services are always overly incorrect, just that they're incorrect often enough that the path of least resistance was to stop using them.

▲dylan604 21 hours ago

Plus, it gave me something to do while the CD was importing, rather than just pushing it into the background, starting work on something else, and promptly forgetting about the import.

▲lksaar 23 hours ago

your best shout for jp cds is hoping someone added them on discogs

▲bananalychee 18 hours ago

I think about half of the Japanese albums I tag have a mistake of some sort on Discogs, such as wrong okurigana or kanji usage. I've corrected some of them myself, but it happens so often that I've mostly given up. In the end it's faster to transcribe from the back cover.

▲GauntletWizard 23 hours ago

I just ripped a small collection (only ~200 discs), and I encountered all of the problems that have been complained about in this thread. I still used MusicBrainz, because it was easier for me to double-check and fix the entries in their DB than to manually type all the data myself.

When bandcamp releases were available but nothing was in the database, I found it quick and simple to copy+paste the track listing into MB and create a new release. Combining it with the TOC I'd already been searching for, I got perfect rips every time without much issue.

Even with a significant amount of time double checking and fixing the metadata, I consider it a good use of time. I was not simply ripping my CDs, I was helping maintain the historical record.

▲mayneack 21 hours ago

There are userscripts to automatically do this from sources like bandcamp: https://musicbrainz.org/doc/Guides/Userscripts

▲JohnFen 20 hours ago

> I was not simply ripping my CDs, I was helping maintain the historical record.

That was how I felt about it in the earlier days, when I'd actively participate in updating/correcting the databases. I stopped feeling that way years ago, though. Right or wrong, it felt like a losing battle as so many corrections were never actually adopted.

▲cloud8421 21 hours ago

> Even with a significant amount of time double checking and fixing the metadata, I consider it a good use of time. I was not simply ripping my CDs, I was helping maintain the historical record.

This is the spirit - I've started doing the same for releases that don't appear in MusicBrainz and it feels great knowing that I'm not just doing this for myself.

▲al_borland 22 hours ago

Was there a period where it was good? I tried it back around 2001 or 2002 and it produced a mess. I swore it off and figured it wouldn’t be around long. Here we are over 20 years later hearing that it’s too error-ridden to use.

▲jandrese 22 hours ago

These days something like MusicBrainz is effectively a legacy system. So few people buy CDs anymore that there's not a lot of interest in maintaining it. It's fairly hard to even find a computer with an optical disk reader these days, especially if you are looking at laptops.

▲cloud8421 21 hours ago

Note that the scope of the project goes beyond CDs, it's a catalogue for pretty much any format where you can play music.

▲Avamander 21 hours ago

It's used as the basis in a _lot_ of places. So fixing errors fixes them in a lot of other websites (and infoboxes).

▲JodieBenitez 22 hours ago

Never worked fine for me, at least not fine enough to trust it.

▲riansanderson 22 hours ago

tangentially related- does anyone have a good recommendation on an external CD drive that works well with macOS and has a good form factor and build quality?

I have an ancient thinkpad that I use a couple of times a year _just for reading cds_ and have considered retiring it. But all the CD drives I see on amazon look like disposable crap.

▲dawnerd 22 hours ago

Pick up an internal drive and get a good enclosure. Way better than any of the external junk on Amazon. Better yet get one of the LG bluray drives that support ripping 4k discs. Might need to flash the firmware. That’s what I use and it’s great and really fast for plain cds as a bonus.

▲aspenmayer 18 hours ago

> Might need to flash the firmware.

I’m a fan of LibreDrive, but have you heard about any similar firmwares for this purpose?

More info about LibreDrive on the forum that hosts discussion about it and tools that it works with:

https://forum.makemkv.com/forum/viewtopic.php?t=18856

▲dawnerd 15 hours ago

There was a thread for the drive's model, which I don’t have near me at the moment. It was a kinda sketchy windows app, but now makemkv shows libredrive.

▲TheAmazingRace 22 hours ago

Anything made by Pioneer these days is a good choice. That said, Pioneer just recently exited the optical disc drive market a month or so ago, so you'll want to pick up a drive while you still can. They tend to be pricier than your generic external disc drive, but they are dead reliable, and fully compatible with software like EAC and XLD.

I have the Pioneer BDR-XS07S slot loading external BluRay burner drive and it does a great job ripping audio CDs.

▲dhosek 20 hours ago

I bought this Pioneer drive

http://www.amazon.com/exec/obidos/ASIN/B0BN66KFV1/donhosek

last year after having two consecutive drives crap out on me, with both not wanting to eject discs or acknowledge discs that were in the drive, and it has worked perfectly for me for this year. It has my strong endorsement.

▲echelon_musk 22 hours ago

When I wanted one for ripping music CDs to my M1 Mac I bought the cheapest used USB to CD/DVD drive on eBay. It's a LITE-ON eUAU108 and hasn't failed me.

▲eisa01 21 hours ago

I tried buying a noname drive from AliExpress, and the drive wouldn't rip correctly with XLD...

You could instead salvage the drive from an old MacBook; it works great with a cheap adapter.

▲Synaesthesia 22 hours ago

The Apple superdrive

▲giantrobot 18 hours ago

While I've ripped hundreds of discs with mine, they do have some downsides. It can be a bitch and a half to get a disc out if it can't be read properly. Even drutil wouldn't eject such discs.

There's also no way to use mini CD/DVDs with them. Not that those were ever super popular but if you have any it's an annoyance.

I replaced my SuperDrive with a 5.25" internal drive in an external powered enclosure. I can always get unreadable discs out easily, have no problem with mini discs, and I'm not stuck with an extremely short USB cable.

A SuperDrive isn't a bad option but there's better available.

▲k__ 16 hours ago

I remember having a game CD+R.

It had scratches and even holes, but somehow it worked, lol.

▲TylerE 8 hours ago

Scratches affect much less than most people think, as long as they're superficial. (For instance, dings from the case are likely fine, but run the tip of a key across the one...)

Contrary to what most people expect, the data pits on a CD are much closer to the label side than the shiny side - the bottom of the disc is a clear plastic layer that, as far as the optics of the drive are concerned, is out of focus.

▲dd_xplore 17 hours ago

I see a lot of praise for MusicBrainz, is it really that good?

▲at_a_remove 18 hours ago

I am keeping an eye on this thread, as I plan to eventually rip my somewhat large collection, but would prefer to do it just the one time.

Exact Audio Copy, the author seems to have moved on to other interests, which is a shame because I was looking for something compatible with an autoloader. And it looks like dbpoweramp is the only one left in that arena.

I am allllll about the metadata. Also, a thumbnail, synced lyrics if they could be found, custom metadata for hyperlinks back to entries on Discogs and MusicBrainz, perhaps some ReplayGain values in fields on the FLAC, depending on my MP3 processing case ... but I have so many unanswered questions.

▲MyPasswordSucks 13 hours ago

> Exact Audio Copy, the author seems to have moved on to other interests, which is a shame because I was looking for something compatible with an autoloader.

Nah, it's mostly just reached the stage where there's nothing left to do - all the "objective" stuff works as it should, and any feature adds would be a pretty heavy undertaking. It was updated a little less than a year ago, and when I contacted the author he was very responsive.

Would it be nice to have a keyboard shortcut for proper [1] cuesheet creation (ironically, all the options except the proper one have keyboard shortcuts)? Yeah, but I've learned to live with it. Would it be nice to have super-duper tagging options? I dunno, from where I'm sitting, it seems like it'd just be duplicating a bunch of foobar2000 features for negligible gain.

[1] Because nobody wants a .FLAC that starts with a few seconds of silence, inter-track gaps need to be appended to the end of the previous track, which is not how Red Book audio handles it, and means that the "proper" cuesheet format is technically a non-compliant cuesheet.
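
To make the footnote concrete, here is a small sketch of what "gaps appended to the previous track" means in terms of cut points. It assumes cue-style MM:SS:FF timestamps at 75 frames per second and cuts each file at INDEX 01, so the stretch between a track's INDEX 00 and INDEX 01 lands at the tail of the previous file. The track list and helper names are invented for illustration and are not how EAC implements it.

```python
FRAMES_PER_SECOND = 75  # CD audio is addressed in 1/75-second frames


def to_frames(timestamp):
    """Convert a cue-style MM:SS:FF timestamp to a frame count."""
    minutes, seconds, frames = (int(part) for part in timestamp.split(":"))
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def split_points(tracks, disc_end):
    """Return (start, end) frame ranges, cutting at each track's INDEX 01.

    Anything between the next track's INDEX 00 and INDEX 01 (its pre-gap)
    falls inside the current track's range.
    """
    starts = [to_frames(track["index01"]) for track in tracks]
    ends = starts[1:] + [to_frames(disc_end)]
    return list(zip(starts, ends))


# Hypothetical three-track disc with two-second gaps before tracks 2 and 3.
example_tracks = [
    {"index01": "00:00:00"},
    {"index00": "03:30:00", "index01": "03:32:00"},
    {"index00": "07:10:00", "index01": "07:12:00"},
]

ranges = split_points(example_tracks, disc_end="10:00:00")
for number, (start, end) in enumerate(ranges, start=1):
    print(f"track {number}: frames {start}..{end}")

# Gap carried at the end of track 1's file, in frames.
gap_after_track_1 = to_frames("03:32:00") - to_frames("03:30:00")
print("gap appended to track 1:", gap_after_track_1)
```

Each file then starts right on the music, and the silence sits at the end of the track before it.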