I have a directory that contains several files, many of which have non-English names. I am using PHP on Windows 7.

I want to list the file names and their contents using PHP.

Currently I am using DirectoryIterator and file_get_contents. This works for English file names but not for non-English (Chinese) file names.

For example, I have filenames like "?? ?? ?????????.eml", "hello ??????.eml".

  1. DirectoryIterator is not able to get the filename using ->getFilename()
  2. file_get_contents is also not able to open even if I hard code the filename in its parameter.

How can I do it?

 Answers

This is not possible, at least as of PHP 5.3 (the version debugged below). It's a limitation of PHP: it calls the multibyte ("A", non-Unicode) versions of the Windows APIs, so you're limited to the characters your code page can represent.

See this answer.

Directory contents:

D:\Users\Cataphract\Desktop\teste2>dir
 Volume in drive D is GRANDEDISCO
 Volume Serial Number is 945F-DB89

 Directory of D:\Users\Cataphract\Desktop\teste2

01-06-2010  17:16    <DIR>          .
01-06-2010  17:16    <DIR>          ..
01-06-2010  17:15                 0 coptic small letter shima follows ?.txt
01-06-2010  17:18                86 teste.php
               2 File(s)             86 bytes
               2 Dir(s)  12.178.505.728 bytes free

Test file contents:

<?php
exec('pause');
foreach (new DirectoryIterator(".") as $v) {
    echo $v."\n";
}

Test file results:

.
..
coptic small letter shima follows ?.txt
teste.php

Debugger output:

Call stack (PHP 5.3.0):

>   php5ts_debug.dll!readdir_r(DIR * dp=0x02f94068, dirent * entry=0x00a7e7cc, dirent * * result=0x00a7e7c0)  Line 80   C
    php5ts_debug.dll!php_plain_files_dirstream_read(_php_stream * stream=0x02b94280, char * buf=0x02b9437c, unsigned int count=260, void * * * tsrm_ls=0x028a15c0)  Line 820 + 0x17 bytes   C
    php5ts_debug.dll!_php_stream_read(_php_stream * stream=0x02b94280, char * buf=0x02b9437c, unsigned int size=260, void * * * tsrm_ls=0x028a15c0)  Line 603 + 0x1c bytes  C
    php5ts_debug.dll!_php_stream_readdir(_php_stream * dirstream=0x02b94280, _php_stream_dirent * ent=0x02b9437c, void * * * tsrm_ls=0x028a15c0)  Line 1806 + 0x16 bytes    C
    php5ts_debug.dll!spl_filesystem_dir_read(_spl_filesystem_object * intern=0x02b94340, void * * * tsrm_ls=0x028a15c0)  Line 199 + 0x20 bytes  C
    php5ts_debug.dll!spl_filesystem_dir_open(_spl_filesystem_object * intern=0x02b94340, char * path=0x02b957f0, void * * * tsrm_ls=0x028a15c0)  Line 238 + 0xd bytes   C
    php5ts_debug.dll!spl_filesystem_object_construct(int ht=1, _zval_struct * return_value=0x02b91f88, _zval_struct * * return_value_ptr=0x00000000, _zval_struct * this_ptr=0x02b92028, int return_value_used=0, void * * * tsrm_ls=0x028a15c0, long ctor_flags=0)  Line 645 + 0x11 bytes  C
    php5ts_debug.dll!zim_spl_DirectoryIterator___construct(int ht=1, _zval_struct * return_value=0x02b91f88, _zval_struct * * return_value_ptr=0x00000000, _zval_struct * this_ptr=0x02b92028, int return_value_used=0, void * * * tsrm_ls=0x028a15c0)  Line 658 + 0x1f bytes   C
    php5ts_debug.dll!zend_do_fcall_common_helper_SPEC(_zend_execute_data * execute_data=0x02bc0098, void * * * tsrm_ls=0x028a15c0)  Line 313 + 0x78 bytes   C
    php5ts_debug.dll!ZEND_DO_FCALL_BY_NAME_SPEC_HANDLER(_zend_execute_data * execute_data=0x02bc0098, void * * * tsrm_ls=0x028a15c0)  Line 423  C
    php5ts_debug.dll!execute(_zend_op_array * op_array=0x02b93888, void * * * tsrm_ls=0x028a15c0)  Line 104 + 0x11 bytes    C
    php5ts_debug.dll!zend_execute_scripts(int type=8, void * * * tsrm_ls=0x028a15c0, _zval_struct * * retval=0x00000000, int file_count=3, ...)  Line 1188 + 0x21 bytes C
    php5ts_debug.dll!php_execute_script(_zend_file_handle * primary_file=0x00a7fad4, void * * * tsrm_ls=0x028a15c0)  Line 2196 + 0x1b bytes C
    php.exe!main(int argc=2, char * * argv=0x028a14c0)  Line 1188 + 0x13 bytes  C
    php.exe!__tmainCRTStartup()  Line 555 + 0x19 bytes  C
    php.exe!mainCRTStartup()  Line 371  C

Is it really a question mark?

dp->fileinfo
{dwFileAttributes=32 ftCreationTime={...} ftLastAccessTime={...} ...}
    dwFileAttributes: 32
    ftCreationTime: {dwLowDateTime=2784934701 dwHighDateTime=30081445 }
    ftLastAccessTime: {dwLowDateTime=2784934701 dwHighDateTime=30081445 }
    ftLastWriteTime: {dwLowDateTime=2784934701 dwHighDateTime=30081445 }
    nFileSizeHigh: 0
    nFileSizeLow: 0
    dwReserved0: 3435973836
    dwReserved1: 3435973836
    cFileName: 0x02f9409c "coptic small letter shima follows ?.txt"
    cAlternateFileName: 0x02f941a0 "COPTIC~1.TXT"
dp->fileinfo.cFileName[34]
63 '?'

Yes! It's character #63.
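
The same effect can be reproduced outside PHP. The sketch below is only an analogy, written in Java rather than PHP, and it uses ISO-8859-1 as a stand-in for a Windows code page (both choices are assumptions made for illustration). It shows that when a string is forced into a single-byte encoding that cannot represent a character, the character is replaced by byte 63, which is '?'.

```java
import java.nio.charset.StandardCharsets;

class CodePageDemo {
    // Encode a name into a single-byte charset. Characters the charset
    // cannot represent are replaced with its replacement byte ('?').
    static byte[] toSingleByte(String name) {
        return name.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+03ED is COPTIC SMALL LETTER SHIMA, as in the test file name.
        String name = "follows \u03ED.txt";
        byte[] narrow = toSingleByte(name);

        // The original character is gone; only '?' survives.
        System.out.println(new String(narrow, StandardCharsets.ISO_8859_1)); // follows ?.txt
        System.out.println(narrow[8]); // 63
    }
}
```

Once the name has been narrowed this way, the original character cannot be recovered, which is why hard-coding the file name does not help either.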

Thursday, August 18, 2022

This is fixed in the upcoming PrimeFaces 6.2; earlier versions need the workaround below. A PrimeFaces issue linked in the comments reports that the workaround works in Chrome, IE and Opera but not in Firefox (no version is given, and Edge is not mentioned).

Workaround

Try to encode your file name in application/x-www-form-urlencoded MIME format (URLEncoder).

Example:

public StreamedContent getFileDown () {
        // Get current position in file table
        this.currentPosition();
        attachments = getAttachments();
        Attachment a = getAttachmentByPosition( pos, attachments );

        FileNameMap fileNameMap = URLConnection.getFileNameMap();
        // Detecting MIME type
        String mimeType = fileNameMap.getContentTypeFor(a.getAttachmentName());
        String escapedFilename = "Unrecognized!!!";
        try {
            // Encoding
            escapedFilename = URLEncoder.encode(a.getAttachmentName(), "UTF-8").replaceAll(
                    "\\+", "%20");
        } catch (UnsupportedEncodingException e1) {         
            e1.printStackTrace();
        }
        // Preparing streamed content
        fileDown = new DefaultStreamedContent( new ByteArrayInputStream( a.getAttachment() ),
                mimeType, escapedFilename);
        return fileDown;
    }
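
To see the encoding step in isolation, here is a minimal sketch with no PrimeFaces dependency. The class name FilenameEncoder and the sample file name are made up for illustration. URLEncoder writes a space as "+", so the "+" is swapped for "%20" afterwards; a literal "+" in the input has already become "%2B" by then and is not affected.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FilenameEncoder {
    // Percent-encode a file name as UTF-8 and use %20 instead of '+' for spaces.
    static String encode(String fileName) {
        try {
            return URLEncoder.encode(fileName, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "\u4F60\u597D" is a two-character Chinese sample name.
        System.out.println(encode("hello \u4F60\u597D.eml"));
        // prints hello%20%E4%BD%A0%E5%A5%BD.eml
    }
}
```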
Saturday, August 13, 2022

You should implement a Comparator to sort the files based on the attributes you mentioned, and pass this as an argument to the Arrays.sort method.

    Arrays.sort(list, new Comparator<File>()
    {
        public int compare(File file1, File file2)
        {
            int result = ...; // your comparison logic goes here
            return result;
        }
    });
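
As a concrete sketch of the comparison logic: the attributes the question asked about are not shown here, so the ordering below (directories first, then name, ignoring case) is purely a hypothetical choice, as is the class name FileSortDemo.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

class FileSortDemo {
    // Hypothetical ordering: directories before regular files,
    // then alphabetical by name, ignoring case.
    static final Comparator<File> DIRS_FIRST_THEN_NAME = new Comparator<File>() {
        @Override
        public int compare(File file1, File file2) {
            // Arguments are swapped so that "is a directory" sorts first.
            int result = Boolean.compare(file2.isDirectory(), file1.isDirectory());
            if (result == 0) {
                result = file1.getName().compareToIgnoreCase(file2.getName());
            }
            return result;
        }
    };

    public static void main(String[] args) {
        File[] list = new File(".").listFiles();
        if (list != null) { // listFiles() returns null if the path is not a directory
            Arrays.sort(list, DIRS_FIRST_THEN_NAME);
            for (File f : list) {
                System.out.println(f.getName());
            }
        }
    }
}
```

To sort on other attributes, change the body of compare, for example to use lastModified() or length().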
Saturday, September 17, 2022
 

Try finfo_file(), passing it the path of the file. Example:

$finfo = finfo_open(FILEINFO_MIME_TYPE);
$mime = finfo_file($finfo, $_FILES['control_name_from_client']['tmp_name']);
finfo_close($finfo);

You need the Fileinfo extension. As the PHP manual says:

The functions in this module try to guess the content type and encoding of a file by looking for certain magic byte sequences at specific positions within the file. While this is not a bullet proof approach the heuristics used do a very good job.
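
To make the "magic byte sequences" idea concrete, here is a deliberately tiny sketch of the technique. It is not how Fileinfo is implemented: it is written in Java, the class name MagicSniffer is made up, and it knows only three well-known signatures (PNG, GIF and PDF), whereas a real magic database holds far more rules.

```java
import java.nio.charset.StandardCharsets;

class MagicSniffer {
    // The 8-byte PNG signature.
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    // GIF files start with "GIF87a" or "GIF89a".
    private static final byte[] GIF = "GIF8".getBytes(StandardCharsets.US_ASCII);
    // PDF files start with "%PDF-".
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Guess a MIME type from the first bytes of a file's content.
    static String guess(byte[] head) {
        if (startsWith(head, PNG)) return "image/png";
        if (startsWith(head, GIF)) return "image/gif";
        if (startsWith(head, PDF)) return "application/pdf";
        return "application/octet-stream"; // unknown content
    }

    public static void main(String[] args) {
        byte[] sample = "GIF89a...".getBytes(StandardCharsets.US_ASCII);
        System.out.println(guess(sample)); // image/gif
    }
}
```

Because the guess depends only on the content, it does not trust the file extension or the MIME type sent by the client.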

Monday, November 21, 2022
 
niccolo
 

Start here: PHP FTP recursive directory listing.

You just need to adjust the code to:

  1. parse the DOS-style listing you get from your FTP server (probably IIS), and
  2. collect only the folders.

function ftp_list_dirs_recursive($ftp_stream, $directory)
{
    $result = [];

    $lines = ftp_rawlist($ftp_stream, $directory);
    if ($lines === false)
    {
        die("Cannot list $directory");
    }

    foreach ($lines as $line)
    {
        // rather lame parsing as a quick example:
        if (strpos($line, "<DIR>") !== false)
        {
            $dir_path = $directory . "/" . ltrim(substr($line, strpos($line, ">") + 1));
            $subdirs = ftp_list_dirs_recursive($ftp_stream, $dir_path);
            $result = array_merge($result, [$dir_path], $subdirs);
        }
    }
    return $result;
}
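
The parsing above is, as the comment says, a quick example: a file whose name happens to contain "<DIR>" would be mistaken for a folder. A stricter variant is sketched below in Java rather than PHP. It assumes each line has the shape "date  time  <DIR>-or-size  name", as in a DOS-style listing; the class name DosListingParser and the sample lines are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DosListingParser {
    // Assumed line shape: date, time, then "<DIR>" or a file size, then the name.
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+)\\s+(\\S+)\\s+(<DIR>|\\d+)\\s+(.+)$");

    // Return the names of the directories found in the listing lines.
    static List<String> directoryNames(List<String> lines) {
        List<String> dirs = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.matches() && "<DIR>".equals(m.group(3))) {
                String name = m.group(4);
                // Skip the self and parent entries to avoid endless recursion.
                if (!name.equals(".") && !name.equals("..")) {
                    dirs.add(name);
                }
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "08-24-22  10:15AM       <DIR>          My Folder",
                "08-24-22  10:16AM                 1234 notes <DIR>.txt");
        System.out.println(directoryNames(lines)); // [My Folder]
    }
}
```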
Wednesday, August 24, 2022
 