14

Is there a convenient way to convert the output of the *nix command tree to JSON format? My goal is to convert something like:

.
|-- dir1
|   |-- dirA
|   |   |-- dirAA
|   |   `-- dirBB
|   `-- dirB
`-- dir2
    |-- dirA
    `-- dirB

into:

{"dir1" : [{"dirA":["dirAA", "dirBB"]}, "dirB"], "dir2": ["dirA", "dirB"]}
Jeff Schaller
    @BausTheBig - I don't think you've thought this through all the way. The tree command isn't the right tool. I might be inclined to do ls -R or a find instead. – slm Sep 10 '13 at 22:12

5 Answers

26

Version 1.7 of tree includes support for JSON:
http://mama.indstate.edu/users/ice/tree/changes.html

Per the man page (under XML/JSON/HTML OPTIONS):

-J     Turn on JSON output. Outputs the directory tree as a JSON formatted array.

e.g.

$ tree -J                                                                                                 

/home/me/trash/tree-1.7.0
[{"type":"directory","name": ".","contents":[
    {"type":"file","name":"CHANGES"},
    {"type":"file","name":"color.c"},
    {"type":"file","name":"color.o"},
    {"type":"directory","name":"doc","contents":[
      {"type":"file","name":"tree.1"},
      {"type":"file","name":"tree.1.fr"},
      {"type":"file","name":"xml.dtd"}
    ]},
    {"type":"file","name":"hash.c"},
    {"type":"file","name":"hash.o"},
    {"type":"file","name":"html.c"},
    {"type":"file","name":"html.o"},
    {"type":"file","name":"INSTALL"},
    {"type":"file","name":"json.c"},
    {"type":"file","name":"json.o"},
    {"type":"file","name":"LICENSE"},
    {"type":"file","name":"Makefile"},
    {"type":"file","name":"README"},
    {"type":"file","name":"strverscmp.c"},
    {"type":"file","name":"TODO"},
    {"type":"file","name":"tree"},
    {"type":"file","name":"tree.c"},
    {"type":"file","name":"tree.h"},
    {"type":"file","name":"tree.o"},
    {"type":"file","name":"unix.c"},
    {"type":"file","name":"unix.o"},
    {"type":"file","name":"xml.c"},
    {"type":"file","name":"xml.o"}
  ]},
  {"type":"report","directories":1,"files":26}
]
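
The -J output above is more verbose than the format asked for in the question. As a rough sketch only (not part of tree itself), the following Python reduces it to the compact directories-only shape. The function names simplify and tree_json_to_compact are made up for illustration, and the code assumes the "type", "name" and "contents" keys shown in the sample output.

```python
import json

def simplify(node):
    """Reduce one `tree -J` directory node to a list of its sub directories.

    A sub directory with children becomes {"name": [...]}; one without
    children becomes just "name". Files are ignored (assumption: the
    question only wants directories).
    """
    children = []
    for child in node.get("contents", []):
        if child.get("type") != "directory":
            continue
        sub = simplify(child)
        children.append({child["name"]: sub} if sub else child["name"])
    return children

def tree_json_to_compact(text):
    """Convert the text printed by `tree -J` into the question's format."""
    entries = json.loads(text)
    # The first directory entry is the root ("."); the report entry is skipped.
    root = next(e for e in entries if e.get("type") == "directory")
    return {
        child["name"]: simplify(child)
        for child in root.get("contents", [])
        if child.get("type") == "directory"
    }
```

Passing the captured text of tree -J to tree_json_to_compact and then to json.dumps should give output in the shape the question asks for.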
don_crissti
Sridhar Sarnobat
7

Attempt 1

A solution using just Perl, returning a simple hash-of-hashes structure. This was written before the OP clarified the desired JSON format.

#!/usr/bin/perl

use strict;
use warnings;
use File::Find;
use JSON;

my $dirs    = {};
my $encoder = JSON->new->ascii->pretty;

find({ wanted => \&process_dir, no_chdir => 1 }, ".");
print $encoder->encode($dirs);

sub process_dir {
    return if !-d $File::Find::name;          # only directories are recorded
    my $ref = \%$dirs;                        # start at the top-level hash
    for (split(/\//, $File::Find::name)) {    # one key per path component
        $ref->{$_} = {} if !exists $ref->{$_};
        $ref = $ref->{$_};                    # descend into that component
    }
}

The File::Find module works in a similar way to the Unix find command. The JSON module takes Perl variables and converts them into JSON.

find({wanted => \&process_dir, no_chdir => 1 }, ".");

This iterates down the file structure from the present working directory, calling the subroutine process_dir for each file or directory under ".". The no_chdir option tells Perl not to issue a chdir() for each directory it finds.

process_dir returns if the present examined file is not a directory:

return if !-d $File::Find::name;

We then grab a reference to the existing hash %$dirs in $ref, split the file path on /, and loop over the pieces with for, adding a new hash key for each path component.
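
For readers who don't speak Perl, the same "walk the path one component at a time" idea can be sketched in Python. This is only an illustration of the technique, not a port of the script: add_path is a hypothetical helper, and the paths are hard-coded rather than found on disk.

```python
def add_path(tree, path):
    """Walk `tree` one path component at a time, creating missing levels."""
    node = tree
    for part in path.split("/"):
        # setdefault returns the existing child dict, or inserts a new empty one
        node = node.setdefault(part, {})
    return tree

# Example paths in the form File::Find would report them (made-up sample data)
dirs = {}
for p in [".", "./dir1", "./dir1/dirA", "./dir1/dirA/subdir1", "./dir2"]:
    add_path(dirs, p)
```

After the loop, dirs holds a single "." key whose value nests dir1, dirA and subdir1, plus an empty dir2, mirroring the hash of hashes the Perl script encodes.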

Making a directory structure like slm did:

mkdir -p dir{1..5}/dir{A,B}/subdir{1..3}

The output is:

{
   "." : {
      "dir3" : {
         "dirA" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         },
         "dirB" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         }
      },
      "dir2" : {
         "dirA" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         },
         "dirB" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         }
      },
      "dir5" : {
         "dirA" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         },
         "dirB" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         }
      },
      "dir1" : {
         "dirA" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         },
         "dirB" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         }
      },
      "dir4" : {
         "dirA" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         },
         "dirB" : {
            "subdir2" : {},
            "subdir3" : {},
            "subdir1" : {}
         }
      }
   }
}

Attempt 2

Okay, now with a different data structure...

#!/usr/bin/perl

use warnings;
use strict;
use JSON;

my $encoder = JSON->new->ascii->pretty;   # ascii character set, pretty format
my $dirs;                                 # used to build the data structure

my $path = $ARGV[0] || '.';               # use the command line arg or working dir

# Open the directory, read in the file list, grep out directories and skip '.' and '..'
# (the pattern is unanchored, so any other name starting with a dot is skipped too)
# and assign to @dirs
opendir(my $dh, $path) or die "can't opendir $path: $!";
my @dirs = grep { ! /^[.]{1,2}/ && -d "$path/$_" } readdir($dh);
closedir($dh);

# recurse the top level sub directories with the parse_dir subroutine, returning
# a hash reference.
%$dirs = map { $_ => parse_dir("$path/$_") } @dirs;

# print out the JSON encoding of this data structure
print $encoder->encode($dirs);

sub parse_dir {
    my $path = shift;    # the dir we're working on

    # get all sub directories (similar to above opendir/readdir calls)
    opendir(my $dh, $path) or die "can't opendir $path: $!";
    my @dirs = grep { ! /^[.]{1,2}/ && -d "$path/$_" } readdir($dh);
    closedir($dh);

    return undef if !scalar @dirs;    # nothing to do here, no sub directories

    my $vals = [];                            # set our result to an empty array
    foreach my $dir (@dirs) {                 # loop the sub directories
        my $res = parse_dir("$path/$dir");    # recurse down each path and get results

        # does the returned value have a result, and is that result an array of at
        # least one element, then add these results to our $vals anonymous array
        # wrapped in an anonymous hash
        # ELSE
        # push just the name of that directory onto our $vals anonymous array
        push(@$vals, (defined $res and scalar @$res) ? { $dir => $res } : $dir);
    }

    return $vals;    # return the recursed result
}

And then running the script on the proposed directory structure...

./tree2json2.pl .
{
   "dir2" : [
      "dirB",
      "dirA"
   ],
   "dir1" : [
      "dirB",
      {
         "dirA" : [
            "dirBB",
            "dirAA"
         ]
      }
   ]
}

I found this pretty damn tricky to get right (especially given the "hash if sub directories, array if not, OH UNLESS top level, then just hashes anyway" logic). So I'd be surprised if this was something you could do with sed / awk ... but then Stephane hasn't looked at this yet I bet :)

Drav Sloan
5

Here is one way using Perl and the JSON perl module.

$ tree | perl -e 'use JSON; @in=grep(s/\n$//, <>);
     print encode_json(\@in)."\n";'

Example

Create some sample data.

$ mkdir -p dir{1..5}/dir{A,B}

Here's what it looks like:

$ tree 
.
|-- dir1
|   |-- dirA
|   `-- dirB
|-- dir2
|   |-- dirA
|   `-- dirB
|-- dir3
|   |-- dirA
|   `-- dirB
|-- dir4
|   |-- dirA
|   `-- dirB
`-- dir5
    |-- dirA
    `-- dirB

15 directories, 0 files

Here's a run using the Perl command:

$ tree | perl -e 'use JSON; @in=grep(s/\n$//, <>); print encode_json(\@in)."\n";'

Which returns this output:

[".","|-- dir1","|   |-- dirA","|   `-- dirB","|-- dir2","|   |-- dirA","|   `-- dirB","|-- dir3","|   |-- dirA","|   `-- dirB","|-- dir4","|   |-- dirA","|   `-- dirB","`-- dir5","    |-- dirA","    `-- dirB","","15 directories, 0 files"]

NOTE: This is just an encapsulation of the output from tree, not a nested hierarchy. The OP changed the question after I suggested this!
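
To get from tree's text output to an actual nested hierarchy, the indentation has to be interpreted. The following Python is a minimal sketch of one way to do that, not a robust parser. It assumes the ASCII drawing style shown in this post (four-character "|   " or "    " groups followed by "|-- " or "`-- "), and the names LINE and parse_tree_text are invented for the example.

```python
import re

# Optional run of 4-character indent groups, then a connector, then the name.
LINE = re.compile(r"^((?:\|   |    )*)(?:\|-- |`-- )(.*)$")

def parse_tree_text(text):
    """Build nested dicts from ASCII `tree` output (assumed format, see above)."""
    root = {}
    stack = [root]  # stack[d] is the dict that receives entries at depth d
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skips the "." header, blank lines and the report line
        depth = len(m.group(1)) // 4
        node = {}
        del stack[depth + 1:]            # drop deeper levels we have left
        stack[depth][m.group(2)] = node  # attach to the parent at this depth
        stack.append(node)               # this entry may get children next
    return root
```

The result is a hash of hashes like the one produced by Attempt 1 in the Perl answer. Newer tree builds may draw with Unicode line characters by default, which this pattern does not handle.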

slm
3

I was also searching for a way to output a Linux folder / file tree to a JSON or XML file. Why not use this simple terminal command:

    tree --dirsfirst --noreport -n -X -i -s -D -f -o my.xml
--dirsfirst: list directories before files
--noreport: don't print a report at the end of the listing
-n: turn colors off
-X: XML output
-i: don't indent lines
-s: print the size in bytes of each file
-D: print the date of the last modification time
-f: print the full path of each file
-o my.xml: output file

So, just use the Linux tree command and configure your own parameters. Here -X gives XML output. For me, that's OK, and I guess there's some script to convert XML to JSON.
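
One possible shape for such an XML-to-JSON script, sketched with Python's standard library. It makes no claim about the exact schema tree -X emits: it simply maps every element to an object holding its tag (under "type"), its attributes, and its children (under "contents"). The function names and those two key names are choices made for this example.

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Generic conversion: tag -> "type", attributes copied, children -> "contents"."""
    node = {"type": elem.tag}
    node.update(elem.attrib)
    children = [element_to_dict(child) for child in elem]
    if children:
        node["contents"] = children
    elif elem.text and elem.text.strip():
        node["text"] = elem.text.strip()  # leaf elements that carry text
    return node

def xml_to_json(xml_text):
    """Parse an XML string and return it as pretty-printed JSON text."""
    return json.dumps(element_to_dict(ET.fromstring(xml_text)), indent=2)
```

Reading my.xml into a string and passing it to xml_to_json would give a JSON document that mirrors the XML nesting; any further reshaping depends on which elements and attributes your tree version writes.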

1

This does the job. https://gist.github.com/debodirno/18a21df0511775c19de8d7ccbc99cb72

import os
import sys
import json

def tree_path_json(path):
    dir_structure = {}
    base_name = os.path.basename(os.path.realpath(path))
    if os.path.isdir(path):
        dir_structure[base_name] = [
            tree_path_json(os.path.join(path, file_name))
            for file_name in os.listdir(path)
        ]
    else:
        return os.path.basename(path)
    return dir_structure

if len(sys.argv) > 1:
    path = sys.argv[1]
else:
    path = '.'

# print() as a function so the script runs under Python 3
print(json.dumps(tree_path_json(path), indent=4, separators=(', ', ' : ')))