
In my script I need to randomly generate MAC addresses. The code below was extracted from a larger script, which is why the MAC address calculation is in a separate function.

The code below works, but when I execute the script it slows down heavily after a number of generated addresses.

How can I improve the speed of generating valid MAC addresses?

#!/bin/bash

devicesCSVMacAddress="55:2d:fa:07" # <- fake MAC address prefix
devicesCSVFile=''

function mac_address() {
    line=''


    # ****************
    # This line below when I calculate a random mac address ending seems to be slow
    line+=$devicesCSVMacAddress$(od -txC -An -N2 /dev/random|tr \  :)
    # ****************


    devicesCSVFile+=$line'\n'
    date
}

for (( i=0; i<100; i++ ))
do
    mac_address
    echo $i
done

echo -e $devicesCSVFile > devices.csv

I used the od tool as described in this answer: How to generate a valid random MAC Address with bash shell.

  • 2
    @OP, might I suggest explaining your issue more concretely (by giving example input/output, or snippets of your script) to understand more precisely what your particular problem is. As is, your question might be tagged as too broad/unsure what you're asking. – Valentin B. Nov 14 '16 at 12:49
  • @ValentinB. Thanks for your feedback. I've added some more explanation. It's really not much I do in the script. I'm just wondering if it's really "good practice" or if there is "good way of doing text processing"? – Bruno Bieri Nov 14 '16 at 13:03
  • as don_crissti said. There isn't a "right" way to do something in shell. Tell us your exact problem and we can give you a more precise solution. A solution can vary depending on the problem. – NinjaGaiden Nov 14 '16 at 13:17
  • @OP I concur with don_crissti, using the right tool for each job and avoiding loops sounds like what you call "good practice". However there is a conceptual blur in your question: the link you provided (while being a very informative topic) is about looping on the contents of a file. That can be avoided in most cases. In your case looping on a list of files might be the only answer, depending on the tools you invoke in your functions (they might be able to handle several files at a time, or not). – Valentin B. Nov 14 '16 at 13:18
  • @ValentinB. you're absolutely right. I confused the two things. I'm not actually reading the contents but only loop over the names. Using loop seemed to me the problem and I found this answer. But actually it's not the root cause. I've added more information what exactly is slow in my script. – Bruno Bieri Nov 14 '16 at 13:32
  • 2
    Repeatedly reading /dev/random slows because you exhaust the available entropy; see the manpage for random(4). MACs are public on the network so making them cryptorandom is a waste of time; use /dev/urandom or even bash $RANDOM. In fact they don't need to be random at all, if you have a convenient way of making them unique. – dave_thompson_085 Nov 15 '16 at 10:36
  • /dev/urandom does the trick and is sufficient in my case. Thanks all for their contributions! – Bruno Bieri Nov 16 '16 at 08:39
  • Be aware that your set of 100 addresses may contain duplicates. There is (and can be) nothing about a random number stream that precludes the same number occurring multiple times in any given data set size. { 1 1 1 } is just as random as { 13 55 4 }. – Chris Davies Nov 16 '16 at 09:40

3 Answers

3

Use /dev/urandom! There is almost no good reason to use /dev/random instead of /dev/urandom -- see Myths about urandom or When to use /dev/random vs /dev/urandom -- and certainly not when you are going to publish the generated numbers all over the place.

/dev/random consumes entropy, blocks and waits if not enough of it is available. /dev/urandom never blocks.
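For instance, the question's generator can simply read from /dev/urandom instead. This is a sketch reshaped from the question's code (same fake prefix), writing the addresses straight to the file rather than accumulating them in a variable:

```shell
#!/bin/bash
# Sketch of the question's generator, with /dev/random swapped for
# /dev/urandom so it no longer blocks when the entropy pool runs low.
prefix="55:2d:fa:07"   # fake MAC address prefix from the question

mac_address() {
    # od prints the two random bytes as " xx yy"; tr turns the
    # spaces into the remaining colons of the MAC address.
    printf '%s%s\n' "$prefix" "$(od -txC -An -N2 /dev/urandom | tr ' ' :)"
}

for (( i = 0; i < 100; i++ )); do
    mac_address
done > devices.csv
```

This generates all 100 addresses without stalling, since /dev/urandom never waits for entropy.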

U880D
AlexP
2

This will avoid duplication while keeping the randomness you desire:

prefix='55:2d:fa:07'
while :
do
    echo $prefix$(od -txC -An -N2 /dev/urandom | tr ' ' :)
done |
    awk '!h[$0]++ {print $0; ONR++} ONR==100 {exit}' >devices.csv

The awk construct keeps track of the lines it's already seen and outputs only those that have not been seen before. When it has output 100 lines it exits and the loop stops.
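Alternatively, as dave_thompson_085 notes in the comments, the random octets don't need to come from /dev/urandom at all. A minimal sketch using only bash's $RANDOM, which avoids spawning od and tr for every address:

```shell
#!/bin/bash
# Sketch: generate the two random octets from bash's $RANDOM instead of
# external tools. $RANDOM is not cryptographic, which is fine here since
# MAC addresses are public on the network anyway.
prefix='55:2d:fa:07'

random_mac() {
    printf '%s:%02x:%02x\n' "$prefix" $(( RANDOM % 256 )) $(( RANDOM % 256 ))
}

random_mac    # prints something like 55:2d:fa:07:9c:41
```

Duplicates are still possible with $RANDOM, of course, so the same awk filter applies if uniqueness matters.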

Chris Davies
2

With zsh:

#! /bin/zsh -
prefix=55:2d:fa:07

(){
  local LC_ALL=C
  IFS= read -ru0 -k2 a < /dev/urandom
  printf '%s:%02x:%02x\n' "$prefix" \'${^(s::)a}
}

This uses only built-ins. The zsh-specific features in there are:

  • it can cope with NUL bytes in its variables (contrary to all other shells)
  • (){...}: anonymous function (here used for the local scope for LC_ALL).
  • read -k2: read two characters (here bytes with LC_ALL=C). ksh93 and bash now have -N for that. You need -u0 (same as ksh) for -k to not mean keyboard keys.
  • (s::) parameter expansion flag to split (here on the empty string so splits into the individual characters).
  • $^array: distributes that array expansion, so it becomes 'x 'y.
  • the printf command is standard here, including the 'x part to get the codepoint value for a character.