Spent the last few days debugging network issues at work.
Exhausting. You never get a full picture. You poke a little here, poke a little there, … Form a hypothesis and test it. Eventually, maybe, you can narrow it down a bit to some segment or even some component.
A very time-consuming process. Even more so if you try not to cause downtime for your users.
I want a magical device that allows me to look inside a cable/fibre.
But hey, at least we got rid of a bunch of Cisco switches in the process. So there’s that.
@firstname.lastname@example.org Network issues at what level? 🤔
@email@example.com In this particular case: Figuring out why a switch decided not to forward ARP broadcasts to certain switch ports. Like, you connect two devices to the switch ports 3 and 4 and they can ping each other – but when you move one of them to port 5, no ping anymore. So, rather low-level stuff.
Or, we have another switch that intermittently doesn’t respond on its management IP anymore (even when you connect your laptop directly to the switch with no other networking hardware in between). I still haven’t figured out this issue.
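For an intermittent problem like the unreachable management IP, it can help to log exactly when the outages happen, so they can be lined up with other events. A minimal sketch, assuming the switch accepts TCP connections on its management IP (port 22 here is an assumption, and `probe`, `outage_windows` and `watch` are made-up names for illustration):

```python
import socket
import time


def probe(host, port=22, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Note: a refused connection also counts as a failure here, even though
    it means the device's IP stack did answer.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_windows(samples):
    """Group consecutive failed samples into (first_fail, last_fail) windows.

    samples is an iterable of (timestamp, ok) pairs in time order.
    """
    windows = []
    start = last = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last = ts
        elif start is not None:
            windows.append((start, last))
            start = None
    if start is not None:
        # The series ended while the host was still unreachable.
        windows.append((start, last))
    return windows


def watch(host, interval=5.0, count=10):
    """Probe host `count` times, `interval` seconds apart; return outage windows."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), probe(host)))
        time.sleep(interval)
    return outage_windows(samples)
```

This only tells you when the switch stopped answering, not why. But timestamps at least give you something to compare against logs or other devices.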
spooky action at a distance. Just remember all computing infra is rocks smashed together in a particular way to move sparkys around in the right statistical model.
Also have you pulled out wireshark yet? 😅
had a similar problem many years ago. some lovely individual decided to create a bunch of trunk ports which had no primary vlan id. they had also created multiple isolated vlans with no routing between subnets.
im convinced cisco was created by sadists
@firstname.lastname@example.org with layer8 being the super glue.
@email@example.com I’ve tcpdump’ed and wireshark’ed the shit out of this. 😂 It’s not very helpful. I’d need to gain insight into the decision making of the switch itself. Why does it drop certain packets? That’s almost impossible to find out (unless it happens to be included in the switch’s logs, which it usually isn’t).
Like @firstname.lastname@example.org said, it’s usually some kind of misconfiguration. Hence you begin to dump the entire switch config into a file and then run diff against the config of a working switch. 🤣 Sometimes this approach works, sometimes it doesn’t …
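The config comparison can be scripted so that noise such as export timestamps doesn't drown out the real differences. A rough sketch using Python's standard `difflib`, assuming the configs are plain-text dumps where `#` starts a comment line (that convention and the sample config lines are made up for illustration, and `normalize` and `config_diff` are hypothetical helpers):

```python
import difflib


def normalize(config_text):
    """Drop blank lines and '#' comment lines, strip trailing whitespace."""
    lines = []
    for line in config_text.splitlines():
        line = line.rstrip()
        if not line or line.lstrip().startswith("#"):
            continue
        lines.append(line)
    return lines


def config_diff(working, broken):
    """Return unified diff lines between two normalized config dumps."""
    return list(difflib.unified_diff(
        normalize(working), normalize(broken),
        fromfile="working", tofile="broken",
        lineterm="", n=0,  # n=0: show only changed lines, no context
    ))


if __name__ == "__main__":
    # Made-up config syntax, only to show the idea.
    working = "# exported monday\nport 3 vlan 10\nport 4 vlan 10\nport 5 vlan 10\n"
    broken = "# exported friday\nport 3 vlan 10\nport 4 vlan 10\nport 5 vlan 20\n"
    for line in config_diff(working, broken):
        print(line)
```

Like the manual approach, this only helps if the problem actually shows up in the config. If two switches export settings in a different order, sorting the lines before diffing might be worth a try, at the cost of losing context.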
We recently changed from Cisco to MikroTik switches. At least those switches offer some kind of basic API, which means we can configure them via our config management – instead of using the switch’s web UI or SSH, like some cave men. That should make our life much easier.
Honestly, all the switches I’ve seen so far were total crap. So far, MikroTik is the best thing. Maybe there are actually good switches out there, but they probably cost a ton of money, and we can’t afford that.
@xuu Sad but true. 🤣
@email@example.com smart move from cisco -> mikrotik. they pack more bang for the buck and offer a less-esoteric configuration system than cisco. the cost of course is a nice outcome. before i became involved with opnsense project i spent many years in mikrotik world. i quite enjoyed it.
cisco is mostly trash these days as their focus is a lot of consumer-grade gear which really is not amazing. they tried to do the whole cloud/sdn thing with meraki but it (much like ubiquiti) is firmware hell and full of exploits that take forever to get patched.
if your group cycles through more gear i’d suggest juniper. and yes, they can be affordable.