What's the quickest, easiest way to read only the first line from a file? I know you can use file(), but in my case there's no point in wasting time loading the whole file.
Preferably a one-liner.
It happens because PHP CS Fixer modifies the file, as @LazyOne said, but your "Output paths to refresh" value is empty, so the IDE cannot know about these changes.
Set the value of "Output paths to refresh" to $FileName$ (the same as in the arguments) to make PhpStorm aware of the changes. The exact value depends on the "Working directory" set in Other Options: if that is set to $FileDir$, you don't need to mention the directory in the paths to refresh.
I'd guess that you are on Windows and that you have I: mapped to a share such as \\server2\files ...
If so, that's your problem. These mappings are only available to the current user (e.g. the admin account), not to the IUSR account that your PHP is probably running as (assuming IIS). Solution: don't use mappings; use the full UNC path instead, i.e. '\\server\share\folder\file.ext'. Also remember that the IUSR account will need access to these shares, folders and files.
I very much doubt that BufferedReader is going to cause a significant overhead. Adding your own code is likely to be at least as inefficient, and quite possibly wrong too.
For example, in the code that you've given you're calling new String(bytes), which is always going to create a string from 1024 bytes, using the platform default encoding... not a good idea. Sure, you clear the array afterwards, but your strings are still going to contain a bunch of '\0' characters, which means a lot of wasted space, apart from anything else. You should at least restrict the portion of the byte array the string is being created from (which also means you don't need to clear the array afterwards).
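The point about restricting the string to the bytes that were actually read can be sketched as follows. The answer is about Java; this is a minimal illustration in C++ of the same idea, and the helper name readChunks is hypothetical, not part of the original code.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `in` in 1024-byte chunks and returns each chunk
// as a string holding only the bytes that were actually read.
std::vector<std::string> readChunks(std::istream& in) {
    char buffer[1024];
    std::vector<std::string> chunks;
    // read() fails on a short final chunk, so also accept a non-zero gcount().
    while (in.read(buffer, sizeof(buffer)) || in.gcount() > 0) {
        // gcount() is the number of bytes the last read() produced; using it
        // as the length keeps stale or zero bytes out of the string.
        chunks.emplace_back(buffer, static_cast<std::size_t>(in.gcount()));
    }
    return chunks;
}
```

Because each string is built from exactly gcount() bytes, the buffer never needs to be cleared between reads.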
Have you actually tried using BufferedReader and found it to be too slow? You should usually write the simplest code which will meet your goals first, and then check whether it's fast enough... especially if your only reason for not doing so is an unspecified resource you "read on the internet". Do you want me to find hundreds of examples of people spouting incorrect performance suggestions? :)
As an alternative, you might want to look at Guava's overload of Files.readLines() which takes a LineProcessor.
As suggested in other answers, it could be a good idea to build a map of the file. The way I would do this (in pseudocode) would be:
let offset be an unsigned 64-bit int = 0
for each line in the file
    read the line
    write offset to a binary file (as 8 raw bytes rather than as chars)
    offset += length of line in bytes (including the newline)
write offset once more (the end of the data, so the last line also has an end position)
Now you have a "map" file that is a list of 64-bit ints (one for each line in the file, plus one final entry). To read the map you just compute where in the map the entry for the line you desire is located:
offset = desired_line_number * 8 // where line number starts at 0
offset2 = (desired_line_number + 1) * 8
data_position1 = load bytes [offset through offset + 7] as a 64-bit int from map
data_position2 = load bytes [offset2 through offset2 + 7] as a 64-bit int from map
data = load bytes [data_position1 through data_position2 - 1] as a string from data
The idea is that you read through the data file once, record the byte offset in the file where each line starts, and then store the offsets sequentially in a binary file using a fixed-size integer type. The map file should then have a size of (number_of_lines + 1) * sizeof(integer_type_used). You then just have to seek into the map file by calculating where you stored the offset for that line number, and read that offset as well as the next line's offset. From there you have a numerical range in bytes of where your data should be located.
Example:
Data:
hello\n
world\n
(\n is the newline at the end of each line)
Create the map.
Map: each grouping [number] represents an 8-byte value in the file
[0][6][12]
//or in binary
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000110
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00001100
Now say I want line 2 (counting lines from 1):
line offset = (2 - 1) * 8 // offset is 8
Since we count bytes from 0, that is the 9th byte in the map file. So our number is made up of bytes 8 through 15, which are:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000110
//or as decimal
6
So now we know that our line starts at offset 6 in our data file (counting from 0, so it is the 7th byte).
We then do the same process to get the start offset of the next line, which is 12.
Finally we look up the byte range 6 through 11, store that as a string, and get world\n.
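The worked example can be checked with a short in-memory sketch. It assumes offsets are counted from zero, so the two lines hello\n and world\n start at 0 and 6 and the data ends at 12. The names buildOffsets and lineAt are hypothetical helpers introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns the byte offset at which each line starts,
// plus one final entry equal to the total size of the data.
std::vector<std::uint64_t> buildOffsets(const std::string& data) {
    std::vector<std::uint64_t> offsets{0};
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == '\n') {
            offsets.push_back(i + 1);  // the next line starts after the '\n'
        }
    }
    if (offsets.back() != data.size()) {
        offsets.push_back(data.size());  // last line has no trailing newline
    }
    return offsets;
}

// Hypothetical helper: extracts a line (counted from 1) using the two
// adjacent offsets that bracket it. Throws std::out_of_range via at() if the
// line does not exist.
std::string lineAt(const std::string& data,
                   const std::vector<std::uint64_t>& offsets,
                   std::size_t line) {
    const std::uint64_t begin = offsets.at(line - 1);
    const std::uint64_t end = offsets.at(line);
    return data.substr(begin, end - begin);
}
```

For the data above, buildOffsets yields three entries and lineAt(data, offsets, 2) returns world followed by its newline. Keeping a final end-of-data entry is a design choice that lets the last line be read the same way as every other line.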
C++ implementation:
#include <cstdint>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>

int main() {
    const std::string dataPath = "path/to/input.txt";
    const std::string mapPath = "path/to/map/file.bin";
    std::ifstream inputFile(dataPath, std::ios::binary);
    std::ofstream outfile(mapPath, std::ios::binary);
    if (!inputFile.is_open() || !outfile.is_open()) {
        //use better error handling than this
        throw std::runtime_error("Error opening files");
    }
    std::string inputData;
    std::uint64_t offset = 0; //fixed 64-bit width, so every map entry is 8 bytes
    while (std::getline(inputFile, inputData)) {
        //write the offset of the start of this line as binary
        outfile.write(reinterpret_cast<const char*>(&offset), sizeof(offset));
        //add one because getline strips the \n; this assumes every line,
        //including the last one, ends with \n
        offset += inputData.length() + 1;
    }
    //write the end-of-data offset so the last line also has an end position
    outfile.write(reinterpret_cast<const char*>(&offset), sizeof(offset));
    outfile.close();
    offset = 0;
    //from here on we are reading the map we just wrote
    std::ifstream inmap(mapPath, std::ios::binary);
    std::uint64_t line = 2; //your chosen line number (counting from 1)
    std::uint64_t idx = (line - 1) * sizeof(offset); //the calculated offset
    //seek into the map
    inmap.seekg(static_cast<std::streamoff>(idx));
    //read the binary at that location
    inmap.read(reinterpret_cast<char*>(&offset), sizeof(offset));
    std::cout << offset << std::endl;
    //from here you just need to look up the data file in the same manner
    return 0;
}
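The remaining lookup step, reading the line itself out of the data file, might look like the sketch below. It assumes the map file holds zero-based start offsets as native-endian 64-bit integers, followed by one final end-of-data entry, and that it is read on the same machine that wrote it. The function name readLineViaMap is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns line `line` (counted from 1) of the file at
// dataPath, using a map file of native-endian 64-bit offsets that holds one
// entry per line start plus a final end-of-data entry.
std::string readLineViaMap(const std::string& dataPath,
                           const std::string& mapPath,
                           std::uint64_t line) {
    std::ifstream map(mapPath, std::ios::binary);
    std::ifstream data(dataPath, std::ios::binary);
    if (!map || !data || line == 0) {
        throw std::runtime_error("bad line number or unreadable file");
    }

    // Two adjacent map entries bracket the line: [begin, end).
    std::uint64_t range[2] = {0, 0};
    map.seekg(static_cast<std::streamoff>((line - 1) * sizeof(std::uint64_t)));
    if (!map.read(reinterpret_cast<char*>(range), sizeof(range))) {
        throw std::out_of_range("line number is past the end of the map");
    }

    // Read exactly end - begin bytes, starting at begin, from the data file.
    std::string text(static_cast<std::size_t>(range[1] - range[0]), '\0');
    data.seekg(static_cast<std::streamoff>(range[0]));
    if (!text.empty() &&
        !data.read(&text[0], static_cast<std::streamsize>(text.size()))) {
        throw std::runtime_error("data file is shorter than the map claims");
    }
    return text;
}
```

Only 16 bytes of the map and the bytes of the requested line are read, no matter how large the data file is.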
Well, you could open the file, read a single line with fgets, and then close the handle.
It's not one line, but if you made it one line you'd either be screwed for error checking, or be leaving resources open longer than you need them, so I'd say keep the multiple lines.
Edit
If you ABSOLUTELY know the file exists, you can use a one-liner that passes the freshly opened file handle straight to fgets.
The reason is that PHP implements RAII for resources.
That means that when the file handle goes out of scope (which happens immediately after the call to fgets in this case), it will be closed.
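The answer is about PHP, but the scope-based cleanup it describes can be illustrated with a rough C++ analogue, where the stream's destructor closes the file. This is only a sketch of the idea, not the PHP code from the answer, and the helper name firstLine is hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns the first line of the file at `path`, or an
// empty string if the file cannot be opened or is empty.
std::string firstLine(const std::string& path) {
    std::ifstream in(path);   // the file is opened here
    std::string line;
    std::getline(in, line);   // reads up to, and discards, the first '\n'
    return line;              // `in` is closed by its destructor on return
}
```

Only the first line is read; the rest of the file is never loaded.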