Re: [Help-bash] bash suitable for parsing big files?
From: Matthew Cengia
Subject: Re: [Help-bash] bash suitable for parsing big files?
Date: Fri, 13 Sep 2013 14:03:29 +1000
User-agent: Mutt/1.5.21 (2010-09-15)
On 2013-09-13 02:55, adrelanos wrote:
> Dennis Williamson:
[...]
> line="firstone secondone thirdone"
>
> How can I get "firstone" into variable "first"? I am using awk.
>
> first="$(echo "$line" | awk '{print $1}')"
read -r first second third _ <<< "$line"
Or:
read -ra arr <<< "$line"
echo "${arr[0]}"
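To make the two forms above concrete, here is a small sketch that can be pasted into a script. The fourth field in the sample line is my addition, only there to show what the trailing "_" placeholder does; everything else mirrors the commands above.

```bash
#!/usr/bin/env bash
# Sample input; "fourthone" is an illustrative extra field, not from the
# original question.
line="firstone secondone thirdone fourthone"

# read splits on IFS (whitespace by default). The trailing "_" soaks up
# any fields beyond the third, so "third" stays clean.
read -r first second third _ <<< "$line"

# The same split, but into an indexed array (one element per field).
read -ra arr <<< "$line"

printf 'first=%s second=%s third=%s\n' "$first" "$second" "$third"
printf 'fields=%d last=%s\n' "${#arr[@]}" "${arr[${#arr[@]}-1]}"
```

No external process is started here: read, printf and the here-string are all handled by Bash itself.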
>
> The recommendation to use awk came from search engines. I wouldn't know
> how to do it without external utility, never found answers to do it in
> pure bash. Until now, it worked well, but if you have an idea how to do
> it pure bash, that'd be great.
Google is not your friend in this case; too many examples of bad code. I
strongly recommend dropping into the #bash channel on Freenode for this
sort of question.
>
> > Consider that many things like awk and
> > grep iterate over the lines in a file for free.
>
> I don't understand. Please elaborate.
When Awk receives input, and that input is multiple lines long, it'll
*automatically* iterate over each line sequentially by default:
address@hidden:tmp$ printf "%s\n" a b c | awk '{ printf("Line %s: %s\n", NR, $0); }'
Line 1: a
Line 2: b
Line 3: c
This means everything is done in a single Awk call, which eliminates
thousands of fork/exec calls and runs much faster than iterating with a
'while' or 'for' loop in Bash and then processing each line in Awk.
Either do it all in Bash, or do it all in Awk. Avoid mixing if at all
possible.
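As a rough sketch of the three approaches (the function names and the sample input are made up for illustration), all of them print the first field of each line, but only the first one pays for a new process per line:

```bash
#!/usr/bin/env bash
# Hypothetical helper names; each reads lines on stdin and prints field 1.

# Mixed (the pattern to avoid): one awk process is started per input line.
mixed() {
  local line
  while IFS= read -r line; do
    awk '{ print $1 }' <<< "$line"
  done
}

# All in Awk: a single process iterates over every line by itself.
all_awk() {
  awk '{ print $1 }'
}

# All in Bash: read does the splitting, no external process at all.
all_bash() {
  local first rest
  while read -r first rest; do
    printf '%s\n' "$first"
  done
}

# Illustrative three-line input.
input=$'a 1\nb 2\nc 3'
out_mixed=$(mixed <<< "$input")
out_awk=$(all_awk <<< "$input")
out_bash=$(all_bash <<< "$input")

printf '%s\n' "$out_bash"
```

On three lines the difference is invisible; on a file with many thousands of lines the per-line awk startup in the mixed version is what dominates the run time.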
>
> > Ultimately, it comes down to "What are you really trying to do?"
>
> Imagine you are using $linux-distribution on hdd and you want to check
> the integrity of your system. You're booting from USB or DVD, which
> you assume is clean of backdoors, while you're not so sure whether
> your hdd contains a backdoor.
>
> The script I am writing looks at what files are installed, downloads the
> package from $linux-distribution's repositories and compares them with
> the ones on the disk. Finally reports which were modified and which ones
> could not be verified (because they are not in a package, auto generated
> files, etc.). [And more.] I am doing such a thing, just not to verify a
> hdd, but to verify a virtual machine image.
>
> Code:
> https://github.com/Whonix/Whonix/blob/master/release/verify_build#L187
>
> Function:
> parse_dpkg_status_file
>
This is what debsums is for: http://packages.debian.org/search?keywords=debsums
--
Regards,
Matthew Cengia