DES contains certain bit permutations which are difficult to implement in software (the P box, IP and FP, and PC2). These can be done with a pattern of shifts and XORs, but it is substantially more complex and less efficient to implement than in hardware, and as a result, the algorithm tends to be slow when implemented on a general-purpose computer.
Typically, symmetric-key primitives that are designed for software use operations that are easily implementable as general-purpose instructions, such as addition and subtraction, integer multiplication, bitwise operations (including rotations), and table lookups on 8-, 16-, 32-, or 64-bit integers. Most of these can be implemented efficiently on all processors with high performance. People usually will not adopt algorithms which are slow and inefficient, so successful algorithms must work within the constraints of real hardware.
Implementations which are suited for hardware may use constructions such as LFSRs or NLFSRs, bit permutations, or other operations involving combinations of single bits or rearrangements of existing bits. These are not efficient in software because most general-purpose computers don't provide for these operations and thus they must be emulated inefficiently.