<insecure_file_uploads_guide>
<title>INSECURE FILE UPLOADS</title>

<critical>Upload surfaces are high risk: server-side execution (RCE), stored XSS, malware distribution, storage takeover, and DoS. Modern stacks mix direct-to-cloud uploads, background processors, and CDNs, so authorization and validation must hold across every step.</critical>

<scope>
- Web/mobile/API uploads, direct-to-cloud (S3/GCS/Azure) presigned flows, resumable/multipart protocols (tus, S3 MPU)
- Image/document/media pipelines (ImageMagick/GraphicsMagick, Ghostscript, ExifTool, PDF engines, office converters)
- Admin/bulk importers, archive uploads (zip/tar), report/template uploads, rich text with attachments
- Serving paths: app directly, object storage, CDN, email attachments, previews/thumbnails
</scope>

<methodology>
1. Map the pipeline: client → ingress (edge/app/gateway) → storage → processors (thumb, OCR, AV, CDR) → serving (app/storage/CDN). Note where validation and auth occur.
2. Identify allowed types, size limits, filename rules, storage keys, and who serves the content. Collect baseline uploads per type and capture resulting URLs and headers.
3. Exercise bypass families systematically: extension games, MIME/content-type, magic bytes, polyglots, metadata payloads, archive structure, chunk/finalize differentials.
4. Validate execution and rendering: can uploaded content execute on server or client? Confirm with minimal PoCs and headers analysis.
</methodology>
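One way to make step 3 systematic is to enumerate the combinations up front instead of trying them ad hoc. The sketch below is illustrative only: `build_cases`, the sample extension and content-type lists, and the `upload-probe` marker are hypothetical names and values, not part of any tool.

```python
from itertools import product

# Illustrative values only; tailor each list to the target's allowed types.
EXTENSIONS = [".jpg", ".jpg.php", ".pHp", ".svg"]
CONTENT_TYPES = ["image/jpeg", "text/html", "application/octet-stream"]
MAGIC_PREFIXES = {
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF89a",
    "none": b"",
}

def build_cases(basename="probe"):
    """Return one upload test case per extension/content-type/magic combination."""
    cases = []
    for ext, ctype, (magic_name, magic) in product(
        EXTENSIONS, CONTENT_TYPES, MAGIC_PREFIXES.items()
    ):
        cases.append({
            "filename": basename + ext,
            "content_type": ctype,
            "magic": magic_name,
            # Benign marker body so retrieved files can be matched to a case.
            "body": magic + b"upload-probe",
        })
    return cases

cases = build_cases()
print(len(cases))  # 4 extensions x 3 content types x 3 magic prefixes = 36
```

Recording which cases are accepted, and which headers come back on retrieval, gives the per-type baseline that step 2 asks for.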

<discovery_techniques>
<surface_map>
- Endpoints/fields: upload, file, avatar, image, attachment, import, media, document, template
- Direct-to-cloud params: key, bucket, acl, Content-Type, Content-Disposition, x-amz-meta-*, cache-control
- Resumable APIs: create/init → upload/chunk → complete/finalize; check if metadata/headers can be altered late
- Background processors: thumbnails, PDF→image, virus scan queues; identify timing and status transitions
</surface_map>

<capability_probes>
- Small probe files of each claimed type; diff resulting Content-Type, Content-Disposition, and X-Content-Type-Options on download
- Magic bytes vs extension: upload files whose JPEG/GIF/PNG headers disagree with their extension; an accepted mismatch reveals reliance on extension or MIME sniffing
- SVG/HTML probe: do they render inline (text/html or image/svg+xml) or download (attachment)?
- Archive probe: simple zip with nested path traversal entries and symlinks to detect extraction rules
</capability_probes>
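The magic-bytes probe can be checked mechanically. A minimal sketch, assuming hypothetical helper names (`sniff_type`, `extension_mismatch`) and covering only the three image formats named above:

```python
import os

# Leading signatures for the three formats named above.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
    "png": b"\x89PNG\r\n\x1a\n",
}
EXTENSION_TYPES = {".jpg": "jpeg", ".jpeg": "jpeg", ".gif": "gif", ".png": "png"}

def sniff_type(data):
    """Return the format whose signature starts the data, or None."""
    for name, signature in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None

def extension_mismatch(filename, data):
    """True when the extension's claimed format differs from the sniffed one."""
    ext = os.path.splitext(filename.lower())[1]
    return EXTENSION_TYPES.get(ext) != sniff_type(data)
```

For example, `extension_mismatch("avatar.png", b"GIF89a...")` is true. A matching signature only shows the first bytes look right; it does not prove the rest of the file parses as that format.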
</discovery_techniques>

<detection_channels>
<server_execution>
- Web shell execution (language dependent), config/handler uploads (.htaccess, .user.ini, web.config) enabling execution
- Interpreter-side template/script evaluation during conversion (ImageMagick/Ghostscript/ExifTool)
</server_execution>

<client_execution>
- Stored XSS via SVG/HTML/JS if served inline without correct headers; PDF JavaScript; office macros in previewers
</client_execution>

<header_and_render>
- Missing X-Content-Type-Options: nosniff, which lets the browser sniff uploaded content into a script-capable type
- Content-Type reflected from the upload rather than set by the server; Content-Disposition: inline vs attachment
</header_and_render>
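These header checks are easy to automate against captured download responses. The sketch below is one possible triage helper; `assess_download_headers` and the `ACTIVE_TYPES` set are hypothetical, and the set of script-capable types is an assumption that should be extended per target.

```python
# Assumed set of content types a browser may render with script support.
ACTIVE_TYPES = {
    "text/html",
    "image/svg+xml",
    "application/xhtml+xml",
    "text/xml",
    "application/xml",
}

def assess_download_headers(headers):
    """Return a list of findings for one download response's headers."""
    h = {k.lower(): v.strip() for k, v in headers.items()}
    findings = []
    ctype = h.get("content-type", "").split(";")[0].strip().lower()
    disposition = h.get("content-disposition", "").lower()
    if h.get("x-content-type-options", "").lower() != "nosniff":
        findings.append("missing nosniff")
    if not disposition.startswith("attachment"):
        findings.append("rendered inline")
    if ctype in ACTIVE_TYPES:
        findings.append("active content type: " + ctype)
    if not ctype:
        findings.append("no content type")
    return findings
```

An empty list means the response is served as an attachment with nosniff and a non-active type; any finding is a lead to confirm in a real browser, not proof of exploitability.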

<process_side_effects>
- AV/CDR race or absence; background job status allows access before scan completes; password-protected archives bypass scanning
</process_side_effects>
</detection_channels>

<core_payloads>
<web_shells_and_configs>
- PHP: GIF polyglot (starts with GIF89a) followed by <?php echo 1; ?>; place it in a path where PHP is executed
- .htaccess to map extensions to code (AddType/AddHandler); .user.ini (auto_prepend_file/auto_append_file) for PHP-FPM
- ASP/JSP equivalents where supported; IIS web.config to enable script execution
</web_shells_and_configs>

<stored_xss>
- SVG with onload/onerror handlers served as image/svg+xml or text/html
- HTML file with script when served as text/html or sniffed due to missing nosniff
</stored_xss>

<mime_magic_polyglots>
- Double extensions: avatar.jpg.php, report.pdf.html; mixed casing: .pHp, .PhAr
- Magic-byte spoofing: valid JPEG header then embedded script; verify server uses content inspection, not extensions alone
</mime_magic_polyglots>

<archive_attacks>
- Zip Slip: entries with ../../ to escape extraction dir; symlink-in-zip pointing outside target; nested zips
- Zip bomb: extreme compression ratios (e.g., 42.zip) to exhaust resources in processors
</archive_attacks>
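Both archive attacks can be detected from the zip's central directory before anything is extracted. A minimal sketch using the standard `zipfile` module; `inspect_zip` and the two limits are hypothetical and chosen only for illustration. Nested zips and Windows drive-letter paths are not handled here.

```python
import io
import posixpath
import stat
import zipfile

MAX_RATIO = 100          # assumed cap on uncompressed/compressed size
MAX_TOTAL = 50 * 2**20   # assumed cap on total uncompressed bytes

def inspect_zip(data):
    """Return reasons to reject a zip archive, without extracting it."""
    problems = []
    total = 0
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            name = info.filename.replace("\\", "/")
            parts = posixpath.normpath(name).split("/")
            if name.startswith("/") or ".." in parts:
                problems.append("traversal: " + info.filename)
            # The upper 16 bits of external_attr carry the Unix mode, if any.
            if stat.S_ISLNK(info.external_attr >> 16):
                problems.append("symlink: " + info.filename)
            total += info.file_size
            if info.file_size > MAX_RATIO * max(info.compress_size, 1):
                problems.append("ratio: " + info.filename)
    if total > MAX_TOTAL:
        problems.append("total size")
    return problems
```

The declared sizes come from archive metadata and can be forged, so a defensive extractor should also cap the bytes it actually writes. When testing, the same function is a quick way to confirm a probe archive really contains the entries you intended.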

<toolchain_exploits>
- ImageMagick/GraphicsMagick legacy vectors (policy.xml may mitigate): crafted SVG/PS/EPS invoking external commands or reading files
- Ghostscript in PDF/PS with file operators (%pipe%)
- ExifTool metadata parsing bugs; overly large or crafted EXIF/IPTC/XMP fields
</toolchain_exploits>

<cloud_storage_vectors>
- S3/GCS presigned uploads: attacker controls Content-Type/Disposition; set text/html or image/svg+xml and inline rendering
- Public-read ACL or permissive bucket policies expose uploads broadly; object key injection via user-controlled path prefixes
- Signed URL reuse and stale URLs; serving directly from bucket without attachment + nosniff headers
</cloud_storage_vectors>
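The defensive counterpart is to have the server fix every signed field so the client cannot choose them. The sketch below builds the fields and conditions for an S3-style POST policy; `build_post_policy`, the allowlist, and the size cap are assumptions for illustration. The returned values are shaped so they could be passed as `Fields` and `Conditions` to boto3's `generate_presigned_post`, which is not called here.

```python
import uuid

ALLOWED_TYPES = {"image/png", "image/jpeg", "application/pdf"}  # assumed allowlist
MAX_BYTES = 5 * 2**20                                           # assumed size cap

def build_post_policy(user_id, declared_type):
    """Return (key, fields, conditions) fixed by the server for one upload."""
    if declared_type not in ALLOWED_TYPES:
        raise ValueError("type not allowed: " + declared_type)
    # Server-generated key: no client-controlled path segments.
    key = "uploads/" + str(int(user_id)) + "/" + uuid.uuid4().hex
    fields = {
        "Content-Type": declared_type,
        "Content-Disposition": "attachment",
        "acl": "private",
    }
    # Exact-match conditions: the signed form is rejected if a field differs.
    conditions = [{name: value} for name, value in fields.items()]
    conditions.append(["content-length-range", 1, MAX_BYTES])
    return key, fields, conditions
```

The declared type is still only a claim; the stored object's real content needs to be verified after upload. When testing a target, compare the policy it issues against this shape: any header or key prefix left unconstrained is a candidate for the vectors above.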
</core_payloads>

<advanced_techniques>
<resumable_multipart>
- Change metadata between init and complete (e.g., swap Content-Type/Disposition at finalize)
- Upload benign chunks, then swap the last chunk or complete with a different source if the server trusts only client-side digests
</resumable_multipart>
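A server that resists both tricks locks metadata at init and re-hashes the bytes it actually holds at finalize. The following is a minimal in-memory sketch; `UploadSession` and its methods are hypothetical and do not mirror tus or S3 multipart APIs.

```python
import hashlib

class UploadSession:
    """Metadata locked at init; chunks are stored server-side as they arrive."""

    def __init__(self, content_type, disposition="attachment"):
        self.content_type = content_type
        self.disposition = disposition
        self.chunks = {}

    def put_chunk(self, index, data):
        self.chunks[index] = data

    def finalize(self, client_sha256, content_type=None):
        # Refuse late metadata changes instead of silently applying them.
        if content_type is not None and content_type != self.content_type:
            raise ValueError("metadata changed after init")
        # Hash the bytes the server actually holds, in chunk order.
        digest = hashlib.sha256()
        for index in sorted(self.chunks):
            digest.update(self.chunks[index])
        if digest.hexdigest() != client_sha256:
            raise ValueError("digest mismatch")
        return b"".join(self.chunks[i] for i in sorted(self.chunks))
```

Type and content validation would then run on the assembled bytes returned by `finalize`. A target that accepts a changed Content-Type or a swapped chunk at completion is behaving differently from this model.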

<filename_and_path>
- Unicode homoglyphs, trailing dots/spaces, device names, reserved characters to bypass validators and filesystem rules
- Null-byte truncation on legacy stacks; overlong paths; case-insensitive collisions overwriting existing files
</filename_and_path>
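A strict validator for client-supplied names might look like the sketch below. `filename_problems` is a hypothetical helper, and the ASCII-only allowlist with a 100-character limit is an assumption, not a universal rule.

```python
import re

# Windows reserved device names (checked without extension, case-insensitive).
DEVICE_NAMES = {"con", "prn", "aux", "nul"} | {
    name + str(n) for name in ("com", "lpt") for n in range(1, 10)
}
# ASCII-only allowlist: rejects homoglyphs, separators, and leading dots.
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,99}")

def filename_problems(name):
    """Return reasons a client-supplied filename should be rejected."""
    problems = []
    if "\x00" in name:
        problems.append("null byte")
    if name != name.rstrip(". "):
        problems.append("trailing dot or space")
    if name.split(".")[0].lower() in DEVICE_NAMES:
        problems.append("device name")
    if not SAFE_NAME.fullmatch(name):
        problems.append("characters outside allowlist")
    return problems
```

Note that `avatar.jpg.php` passes every check here: character rules do not replace an extension allowlist, and storing files under server-generated names avoids most of these issues entirely.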

<processing_races>
- Request the file immediately after upload, before AV/CDR completes, or during derivative creation, to retrieve unprocessed content
- Trigger heavy conversions (large images, deep PDFs) to widen race windows
</processing_races>
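The race exists only when reads are possible before a verdict. A minimal model of a gate that closes the window, with hypothetical names (`ScanStatus`, `QuarantineStore`) and an in-memory dict standing in for real storage:

```python
from enum import Enum

class ScanStatus(Enum):
    PENDING = "pending"
    CLEAN = "clean"
    INFECTED = "infected"

class QuarantineStore:
    """Holds uploads and releases bytes only after a clean verdict."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # Every new object starts quarantined, including derivatives.
        self._objects[key] = [ScanStatus.PENDING, data]

    def set_verdict(self, key, status):
        self._objects[key][0] = status

    def get(self, key):
        status, data = self._objects[key]
        if status is not ScanStatus.CLEAN:
            raise PermissionError("object not released: " + status.value)
        return data
```

If a target returns file bytes while its own status endpoint still reports a pending scan, the read path is not gated this way and the race is worth pursuing.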

<metadata_abuse>
- Oversized EXIF/XMP/IPTC blocks to trigger parser flaws; payloads in document properties of Office/PDF rendered by previewers
</metadata_abuse>

<header_manipulation>
- Force inline rendering with Content-Type + inline Content-Disposition; test browsers with and without nosniff
- Cache poisoning via CDN when the cache key ignores inputs that change the served Content-Type/Disposition (e.g., a missing Vary header or unkeyed query parameters)
</header_manipulation>
</advanced_techniques>

<filter_bypasses>
<validation_gaps>
- Client-side only checks; relying on JavaScript validation or the MIME type supplied by the browser; trusting per-part multipart headers blindly
- Extension allowlists without server-side content inspection; magic-bytes only without full parsing
</validation_gaps>

<evasion_tricks>
- Double extensions, mixed case, hidden dotfiles, extra dots (file..png), long paths with allowed suffix
- Multipart name vs filename vs path discrepancies; duplicate parameters where the last value takes precedence
</evasion_tricks>
</filter_bypasses>

<special_contexts>
<rich_text_editors>
- RTEs allow image/attachment uploads and embed links; verify sanitization and serving headers for embedded content
</rich_text_editors>

<mobile_clients>
- Mobile SDKs may send nonstandard MIME or metadata; servers sometimes trust client-side transformations or EXIF orientation
</mobile_clients>

<serverless_and_cdn>
- Direct-to-bucket uploads with Lambda/Workers post-processing; verify that security decisions are not delegated to frontends
- CDN caching of uploaded content; ensure correct cache keys and headers (attachment, nosniff)
</serverless_and_cdn>
</special_contexts>

<parser_hardening>
- Validate on server: strict allowlist by true type (parse enough to confirm), size caps, and structural checks (dimensions, page count)
- Strip active content: convert SVG→PNG; remove scripts/JS from PDF; disable macros; normalize EXIF; consider CDR for risky types
- Store outside web root; serve via application or signed, time-limited URLs with Content-Disposition: attachment and X-Content-Type-Options: nosniff
- For cloud: private buckets, per-request signed GET, enforce Content-Type/Disposition on GET responses from your app/gateway
- Disable execution in upload paths; ignore .htaccess/.user.ini; sanitize keys to prevent path injections; randomize filenames
- AV + CDR: scan synchronously when possible; quarantine until verdict; block password-protected archives or process in sandbox
</parser_hardening>
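The serving rules above reduce to building response headers from server-side state only. A minimal sketch, assuming a hypothetical `download_headers` helper and an illustrative `SERVE_TYPES` mapping from server-verified types to response types:

```python
from urllib.parse import quote

# Assumed allowlist mapping stored, server-verified types to response types.
SERVE_TYPES = {"png": "image/png", "jpeg": "image/jpeg", "pdf": "application/pdf"}

def download_headers(verified_type, display_name):
    """Build response headers from server-side state, never from the upload."""
    content_type = SERVE_TYPES.get(verified_type, "application/octet-stream")
    # Percent-encoding keeps quotes, CR/LF, and non-ASCII out of the header.
    encoded = quote(display_name, safe="")
    return {
        "Content-Type": content_type,
        "Content-Disposition": "attachment; filename*=UTF-8''" + encoded,
        "X-Content-Type-Options": "nosniff",
    }
```

Anything not on the allowlist falls back to `application/octet-stream`, so an SVG or HTML upload is downloaded rather than rendered. The original filename is used only as an encoded display name, never as a storage path.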

<validation>
1. Demonstrate execution or rendering of active content: web shell reachable, or SVG/HTML executing JS when viewed.
2. Show filter bypass: upload accepted despite restrictions (extension/MIME/magic mismatch) with evidence on retrieval.
3. Prove header weaknesses: inline rendering without nosniff or missing attachment; present exact response headers.
4. Show race or pipeline gap: access before AV/CDR; extraction outside intended directory; derivative creation from malicious input.
5. Provide reproducible steps: request/response for upload and subsequent access, with minimal PoCs.
</validation>

<false_positives>
- Upload stored but never served back; or always served as attachment with strict nosniff
- Converters run in locked-down sandboxes with no external IO and no script engines; no path traversal on archive extraction
- AV/CDR blocks the payload and quarantines; access before scan is impossible by design
</false_positives>

<impact>
- Remote code execution on application stack or media toolchain host
- Persistent cross-site scripting and session/token exfiltration via served uploads
- Malware distribution via public storage/CDN; brand/reputation damage
- Data loss or corruption via overwrite/zip slip; service degradation via zip bombs or oversized assets
</impact>

<pro_tips>
1. Keep PoCs minimal: tiny SVG/HTML for XSS, a single-line PHP/ASP where relevant, and benign magic-byte polyglots.
2. Always capture download response headers and final MIME from the server/CDN; that decides browser behavior.
3. Prefer transforming risky formats to safe renderings (SVG→PNG) rather than attempting complex sanitization.
4. In presigned flows, constrain all headers and object keys server-side; ignore client-supplied ACL and metadata.
5. For archives, extract in a chroot/jail with explicit allowlist; drop symlinks and reject traversal.
6. Test finalize/complete steps in resumable flows; many validations only run on init, not at completion.
7. Verify background processors with EICAR and tiny polyglots; ensure quarantine gates access until safe.
8. When you cannot get execution, aim for stored XSS or header-driven script execution; both are impactful.
9. Validate that CDNs honor attachment/nosniff and do not override Content-Type/Disposition.
10. Document full pipeline behavior per asset type; defenses must match actual processors and serving paths.
</pro_tips>

<remember>Secure uploads are a pipeline property. Enforce strict type, size, and header controls; transform or strip active content; never execute or inline-render untrusted uploads; and keep storage private with controlled, signed access.</remember>
</insecure_file_uploads_guide>