Question:
How to read and echo file size of uploaded file being written at server in real time without blocking at both server and client?
Context:
Progress of file upload being written to server from POST request made by fetch(), where body is set to Blob, File, TypedArray, or ArrayBuffer object.
The current implementation sets a File object as the body property of the object passed as the second parameter to fetch().
Requirement:
Read and echo to the client, as text/event-stream, the size of the file being written to the filesystem at the server. Stop when all of the bytes, provided to the script as a query string parameter of the GET request, have been written. The read of the file currently takes place in a separate script environment: the GET call to the script that reads the file is made following the POST to the script that writes the file to the server.
Have not yet reached error handling for potential issues with the write of the file to the server or the read of the file to get the current file size, though that would be the next step once the echo of the file size is working.
Presently attempting to meet the requirement using php, though also interested in c, bash, nodejs, python, or other languages or approaches which can be used to perform the same task.
The client-side javascript portion is not an issue. Simply not that versed in php, one of the most common server-side languages used on the World Wide Web, to implement the pattern without including parts which are not necessary.
Motivation:
Progress indicators for fetch?
Related:
Fetch with ReadableStream
Issues:
Getting
PHP Notice: Undefined index: HTTP_LAST_EVENT_ID in stream.php on line 7
at terminal.
Also, substituting
while(file_exists($_GET["filename"])
&& filesize($_GET["filename"]) < intval($_GET["filesize"]))
for
while(true)
produces an error at EventSource.
Without the sleep() call, the correct file size for a 3.3MB file, 3321824, was dispatched to the message event and printed at console 61921, 26214, and 38093 times, respectively, when the same file was uploaded three times. The expected result is the size of the file as it is being written at
stream_copy_to_stream($input, $file);
instead of the file size of the uploaded file object. Are fopen() or stream_copy_to_stream() blocking with respect to a different php process at stream.php?
Tried so far:
php is attributed to
- Beyond $_POST, $_GET and $_FILE: Working with Blob in JavaScriptPHP
- Introduction to Server-Sent Events with PHP example
php
// can we merge `data.php`, `stream.php` to same file?
// can we use `STREAM_NOTIFY_PROGRESS`
// "Indicates current progress of the stream transfer
// in bytes_transferred and possibly bytes_max as well" to read bytes?
// do we need to call `stream_set_blocking` to `false`
// data.php
<?php
$filename = $_SERVER["HTTP_X_FILENAME"];
$input = fopen("php://input", "rb");
$file = fopen($filename, "wb");
stream_copy_to_stream($input, $file);
fclose($input);
fclose($file);
echo "upload of " . $filename . " successful";
?>
// stream.php
<?php
header("Content-Type: text/event-stream");
header("Cache-Control: no-cache");
header("Connection: keep-alive");
// `PHP Notice: Undefined index: HTTP_LAST_EVENT_ID in stream.php on line 7` ?
$lastId = $_SERVER["HTTP_LAST_EVENT_ID"] || 0;
if (isset($lastId) && !empty($lastId) && is_numeric($lastId)) {
    $lastId = intval($lastId);
    $lastId++;
}
// else {
//     $lastId = 0;
// }
// while current file size read is less than or equal to
// `$_GET["filesize"]` of `$_GET["filename"]`
// how to loop only when above is `true`
while (true) {
    $upload = $_GET["filename"];
    // is this the correct function and variable to use
    // to get written bytes of `stream_copy_to_stream($input, $file);`?
    $data = filesize($upload);
    // $data = $_GET["filename"] . " " . $_GET["filesize"];
    if ($data) {
        sendMessage($lastId, $data);
        $lastId++;
    }
    // else {
    //     close stream
    // }
    // not necessary here, though without thousands of `message` events
    // will be dispatched
    // sleep(1);
}
function sendMessage($id, $data) {
    echo "id: $id\n";
    echo "data: $data\n\n";
    ob_flush();
    flush();
}
?>
javascript
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<input type="file">
<progress value="0" max="0" step="1"></progress>
<script>
const [url, stream, header] = ["data.php", "stream.php", "x-filename"];
const [input, progress, handleFile] = [
  document.querySelector("input[type=file]")
  , document.querySelector("progress")
  , (event) => {
      const [file] = input.files;
      const [{size:filesize, name:filename}, headers, params] = [
        file, new Headers(), new URLSearchParams()
      ];
      // set `filename`, `filesize` as search parameters for `stream` URL
      Object.entries({filename, filesize})
      .forEach(([...props]) => params.append.apply(params, props));
      // set header for `POST`
      headers.append(header, filename);
      // reset `progress.value` set `progress.max` to `filesize`
      [progress.value, progress.max] = [0, filesize];
      const [request, source] = [
        new Request(url, {
          method:"POST", headers:headers, body:file
        })
        // https://.com/a/42330433/
        , new EventSource(`${stream}?${params.toString()}`)
      ];
      source.addEventListener("message", (e) => {
        // update `progress` here,
        // call `.close()` when `e.data === filesize`
        // `progress.value = e.data`, should be this simple
        console.log(e.data, e.lastEventId);
      }, true);
      source.addEventListener("open", (e) => {
        console.log("fetch upload progress open");
      }, true);
      source.addEventListener("error", (e) => {
        console.error("fetch upload progress error");
      }, true);
      // sanity check for tests,
      // we don't need `source` when `e.data === filesize`;
      // we could call `.close()` within `message` event handler
      setTimeout(() => source.close(), 30000);
      // we don't need `source` to be in `Promise` chain,
      // though we could resolve if `e.data === filesize`
      // before `response`, then wait for `.text()`; etc.
      // TODO: if and where to merge or branch `EventSource`,
      // `fetch` to single or two `Promise` chains
      const upload = fetch(request);
      upload
      .then(response => response.text())
      .then(res => console.log(res))
      .catch(err => console.error(err));
    }
];
input.addEventListener("change", handleFile, true);
</script>
</body>
</html>
You need to call clearstatcache() to get the real file size, because PHP caches the result of filesize(). With a few other bits fixed, your stream.php may look like the following:
A few caveats:
Security. I mean the lack of it. As I understand it, this is a proof of concept and security is the least of your concerns, yet the disclaimer should be there. This approach is fundamentally flawed, and should be used only if you don't care about DoS attacks or about information about your files leaking out.
CPU. Without usleep the script will consume 100% of a single core. With a long sleep you are at risk of uploading the whole file within a single iteration, and the exit condition will never be met. If you are testing it locally, the usleep should be removed completely, since it is a matter of milliseconds to upload MBs locally.
Open connections. Both apache and nginx/fpm have a finite number of php processes that can serve requests. A single file upload will take 2 of them for the time required to upload the file. With slow bandwidth or forged requests, this time can be quite long, and the web server may start to reject requests.
Client-side part. You need to analyse the response and finally stop listening to the events when the file is fully uploaded.
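For the client-side caveat, a minimal sketch of a message listener that parses the reported size, updates the progress element, and closes the EventSource once the full size has been reported. makeProgressHandler is a hypothetical helper name; source and progress correspond to the objects in the question's page, and the sketch assumes e.data carries the byte count as a decimal string.

```javascript
// Hypothetical helper: builds a "message" listener for the EventSource.
// `source` needs a close() method, `progress` a writable `value` property.
function makeProgressHandler(source, progress, filesize) {
  return function onMessage(e) {
    const written = Number.parseInt(e.data, 10);
    if (Number.isNaN(written)) {
      return; // ignore frames that do not carry a byte count
    }
    // never let the bar run past its maximum
    progress.value = Math.min(written, filesize);
    if (written >= filesize) {
      source.close(); // file fully written, stop listening
    }
  };
}
```

In the question's page this would be attached with source.addEventListener("message", makeProgressHandler(source, progress, filesize)).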
EDIT:
To make it more or less production friendly, you will need in-memory storage such as redis or memcache to store file metadata.
When making the POST request, add a unique token that identifies the file, along with the file size.
In your javascript:
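A minimal sketch of the client-side change, assuming one shared token is sent both with the POST and with the EventSource URL. The header names x-token and x-filesize, the query parameter names, and the helpers buildUploadTargets and makeToken are assumptions for illustration, not names confirmed by the answer. The token generator is deliberately simple and not cryptographically strong.

```javascript
// Hypothetical helper: derives the POST headers and the EventSource URL
// from one shared token, so both requests refer to the same upload.
function buildUploadTargets(file, token, streamUrl = "stream.php") {
  const params = new URLSearchParams({ token, filesize: String(file.size) });
  return {
    // plain object, accepted by the `headers` option of fetch()
    headers: {
      "x-filename": file.name,
      "x-filesize": String(file.size), // assumed header name
      "x-token": token, // assumed header name
    },
    streamUrl: `${streamUrl}?${params.toString()}`,
  };
}

// Simple, non-cryptographic token; enough to tell uploads apart in a demo.
function makeToken() {
  return Date.now().toString(36) + Math.random().toString(36).slice(2);
}
```

In the question's page, the returned headers would go into the Request options and streamUrl into new EventSource().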
In data.php register the token and report progress by chunks:
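A minimal sketch of this step, written in JavaScript for Node.js (which the question lists as an acceptable alternative) rather than PHP. A plain Map stands in for redis or memcache, and writeInChunks, store, and the record shape {size, received} are hypothetical names chosen for illustration. It is synchronous for brevity; a real handler would consume the request stream asynchronously.

```javascript
// Stand-in for redis/memcache: token -> { size, received }.
const store = new Map();

// Hypothetical helper: consumes the request body chunk by chunk, hands every
// chunk to `writeChunk` (for example a function that appends to an open file)
// and records the running byte count under the upload token.
function writeInChunks(token, size, chunks, writeChunk) {
  const record = { size, received: 0 };
  store.set(token, record); // register the upload before the first byte
  for (const chunk of chunks) {
    writeChunk(chunk);
    record.received += chunk.length; // progress is visible after every chunk
  }
  return record.received;
}
```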
So your stream.php doesn't need to hit the disk at all, and can sleep for as long as is acceptable for the UX:
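A minimal sketch of the reading side, in JavaScript rather than PHP, with a Map standing in for the in-memory store. formatEvent, progressFrame, and the record shape {size, received} are hypothetical. It only shows how one frame of the text/event-stream response can be built from stored metadata without touching the file.

```javascript
// Formats one Server-Sent Events frame: an id line, a data line,
// and the blank line that terminates the event.
function formatEvent(id, data) {
  return `id: ${id}\ndata: ${data}\n\n`;
}

// Hypothetical helper: looks up the upload by token and returns the next
// frame plus a flag telling the caller whether the stream can be ended.
function progressFrame(store, token, id) {
  const record = store.get(token);
  if (!record) {
    // nothing registered yet: report zero and keep the stream open
    return { frame: formatEvent(id, 0), done: false };
  }
  return {
    frame: formatEvent(id, record.received),
    done: record.received >= record.size,
  };
}
```

A server loop would write frame to the response, stop when done is true, and otherwise wait (for example one second) before asking again.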
The problem with 2 open connections cannot be solved unless you give up EventSource for good old polling. The response time of stream.php without the loop is a matter of milliseconds, and it is quite wasteful to keep the connection open all the time, unless you need hundreds of updates a second.
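A minimal sketch of the polling alternative on the client. It assumes a hypothetical endpoint that answers each GET with the current byte count as plain text. pollProgress, its parameters, and the injectable fetchFn and wait functions are illustrative names, not part of the answer.

```javascript
// Hypothetical polling loop: asks the server for the byte count, reports it,
// and resolves once the whole file has been written.
// `fetchFn` defaults to the global fetch; `wait` resolves after `ms` milliseconds.
async function pollProgress(url, filesize, onProgress, {
  interval = 1000,
  fetchFn = fetch,
  wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  for (;;) {
    const response = await fetchFn(url);
    const written = Number.parseInt(await response.text(), 10) || 0;
    onProgress(written);
    if (written >= filesize) {
      return written;
    }
    await wait(interval); // no connection is held open between polls
  }
}
```

The sketch loops until the reported size reaches filesize, so a real page would also want a timeout or retry limit in case the upload fails.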