My take on what makes a good CW transceiver:
2) Good sensitivity (so you can hear the weak ones). How much is enough varies with the band and your location.
2) Good selectivity. This doesn't just mean sharp filters or DSP; it means the filter characteristics and placement are optimized for CW.
3) Good "dynamic range". This is actually several characteristics (blocking dynamic range, third-order IMD, phase noise) and I won't give a detailed description.
What matters is that 1), 2) and 3) are put together so as to let you hear the weak ones clearly on a band that's full of strong signals.
4) A nice big tuning knob (2-3" diameter), a slow tuning rate (5 kHz/turn or less) and an easy-to-read dial (doesn't have to be digital, but shouldn't require squinting or a magnifier). The dial drive should have a nice feel and not be tiring to use.
5) Good stability. Doesn't have to be NIST-standard but don't want to go chasing up and down the band.
6) Good clean audio, including the sidetone. While it doesn't have to be hi-fi, low-distortion audio can make the difference between a set that's easy to listen to for hours, and one that's tiring.
7) Defeatable AGC and a good RF gain control. (AGC isn't really needed and often makes things worse).

8) RIT and split operation.
9) Good clean keying at all speeds. This means no clicks, no chirps, no burps, no short-first-dit, etc. It also means very little tx delay (see below in QSK discussion).
10) Easy to use controls. (Shouldn't have to hunt through menus to turn off the AGC or change the keyer speed, etc.)
The exact choices for a lot of these things are up to the operator, and you have to know your own likes and dislikes. For example, some ops aren't bothered by a tuning rate of 20-25 kHz per turn, which for me is way too fast. OTOH I don't use AGC, which others consider essential.
---
Now about QSK/full-break-in:
First off, understand that true QSK/full-break-in means you can hear between dits and dahs, even at high speeds. IOW, whenever the key is up, you hear the frequency, right down to the noise. (Nobody can really hear the frequency with the key down unless the transmitting and receiving antennas are well separated and the tx and rx are not on the same frequency).
There are a number of ways to achieve QSK/full-break-in. All of them can be good; what matters is the implementation.
A lot of rigs say they have "break-in" or "semi-break-in" or similar, but what they really have is a form of "keyed-VOX" which lets you hear between words, if it's adjusted that way. A nice feature, but it's not full-break-in. It's just a convenience feature so you don't have to throw a switch.
Whether or not QSK/full-break-in is worthwhile depends on the operator and the kind of operation. Some folks find it hard to copy when they can hear the frequency between dits and dahs, others like it.
Some folks' operation is such that QSK makes a big difference (traffic handling, S&P contesting & DXing) while others don't need it. The ham whose top speed is 20 wpm on a good day may find that a certain rig's QSK is excellent, while another ham who cruises on the high side of 40 per thinks the same rig has terrible QSK.
In theory, there is no reason a transceiver can't have QSK that's every bit as good as separate tx and rx. And some transceivers like the Elecraft K2 are very good.
But not all transceivers do QSK well. Here's one reason why:
Consider the classic separate tx-rx setup that was "state of the art" 40-50 years ago.
Such a setup would consist of:
1) Transmitter that had very clean, no-delay keying and was connected to the antenna at all times
2) Receiver that had very fast mute and unmute properties
3) Electronic or relay TR switch that could operate at very high speeds.
There was also a control system (often just a high speed relay) to mute the receiver when the key was down.
With such a system, you press the key and the transmitter emits RF almost instantly. The receiver also mutes almost instantly. Release the key and the transmitter goes quiet while the receiver comes to life. The electronic TR switch keeps the transmitter's RF out of the receiver front end.
Total R-to-T and T-to-R delays in such setups could be held down to a couple of milliseconds, far shorter than a dit, even at high speeds.
Because the tx and rx are almost totally independent, the changeover can happen very fast.
But with transceivers there's often more to be done - it depends on the exact design of the rig. For example, most transceivers use some stages for both transmit and receive, and their changeover takes time.
This time factor is usually most pronounced in the oscillators, which often have to change frequency between T and R. So when you press the key, the transmitter cannot be allowed to emit RF right away; there has to be time allowed for the oscillators to QSY to the new frequency before the transmitter actually keys. When the key is released, the reverse sequence has to happen.
Depending on things such as PLL lockup time, these added delays can be considerable, particularly at high speeds. Look at the high-speed keying pictures in QST Product Reviews and you'll see that some rigs have considerable differences between the key-closure waveform and the transmitted-RF waveform.
Some numbers to illustrate the problem:
When typical plain-language English is sent by Morse Code, the length of a dit (the key-down time) in milliseconds is about 1200/WPM. So a dit at 12 wpm is 100 milliseconds long, while a dit at 60 wpm is only 20 milliseconds long.
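For anyone who wants to run the numbers for their own speed, here is a minimal Python sketch of the 1200/WPM rule of thumb. The function name dit_ms is just something made up for illustration, and the rule itself assumes standard timing (the usual "PARIS" word of 50 dit-lengths), so treat the results as approximate.

```python
def dit_ms(wpm: float) -> float:
    """Approximate dit (key-down) length in milliseconds at a given speed.

    Uses the 1200/WPM rule of thumb quoted above. A 50-unit standard
    word gives 60000 ms / (50 * WPM) = 1200 / WPM.
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return 1200.0 / wpm


if __name__ == "__main__":
    # Print dit lengths for a few common speeds.
    for wpm in (12, 20, 30, 40, 60):
        print(f"{wpm:2d} wpm: dit is about {dit_ms(wpm):.1f} ms")
```

The space between the dits and dahs inside a character is the same length as a dit, so this is also the time available for listening between elements.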
If the R-to-T and T-to-R delays can be kept down to a couple of milliseconds, the system will be good at almost all typical amateur code speeds.
But if there's a delay of, say, 20 milliseconds from R-to-T and T-to-R, the picture is quite different. At low speeds the system works fine, but by the time you get to 30 wpm the transition times become a major factor. At 60 wpm you cannot hear between dits at all. During a string of dits, the actual RF is being transmitted during the time the key is up, because the delays are the same as a dit length.
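The same comparison can be sketched in a few lines of Python. This is a simplified model, not a description of any particular rig: it assumes the receiver is muted from the moment the key closes, comes back one T-to-R delay after the key opens, and that the space between elements is one dit long. The function name listen_window_ms is hypothetical.

```python
def listen_window_ms(wpm: float, t_to_r_delay_ms: float) -> float:
    """Time in ms the receiver is live during one inter-element space.

    Simplified model (an assumption): the space is one dit long
    (1200/WPM ms) and the receiver only recovers t_to_r_delay_ms
    after the key opens. Never returns a negative value.
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    gap_ms = 1200.0 / wpm
    return max(0.0, gap_ms - t_to_r_delay_ms)


if __name__ == "__main__":
    # Compare a fast changeover (2 ms) with a slow one (20 ms).
    for delay in (2.0, 20.0):
        print(f"T-to-R delay {delay:.0f} ms:")
        for wpm in (12, 20, 30, 40, 60):
            gap_ms = 1200.0 / wpm
            window = listen_window_ms(wpm, delay)
            pct = 100.0 * window / gap_ms
            print(f"  {wpm:2d} wpm: {window:5.1f} ms of {gap_ms:5.1f} ms ({pct:3.0f}%)")
```

Under those assumptions, a 2 ms delay still leaves most of the space open at 60 wpm, while a 20 ms delay eats half the space at 30 wpm and all of it at 60 wpm.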
And 20 milliseconds is not a lot of time.
There *are* ways to deal with these issues. The main point is that some rigs deal with them much better than others.
73 de Jim, N2EY