
0xallam 6cb1c20978 refactor: migrate skills from Jinja to Markdown

2026-01-20 12:50:59 -08:00


INSECURE FILE UPLOADS

Critical

Upload surfaces are high risk: server-side execution (RCE), stored XSS, malware distribution, storage takeover, and DoS. Modern stacks mix direct-to-cloud uploads, background processors, and CDNs, so authorization and validation must hold at every step.

Scope

Web/mobile/API uploads, direct-to-cloud (S3/GCS/Azure) presigned flows, resumable/multipart protocols (tus, S3 MPU)
Image/document/media pipelines (ImageMagick/GraphicsMagick, Ghostscript, ExifTool, PDF engines, office converters)
Admin/bulk importers, archive uploads (zip/tar), report/template uploads, rich text with attachments
Serving paths: app directly, object storage, CDN, email attachments, previews/thumbnails

Methodology

Map the pipeline: client → ingress (edge/app/gateway) → storage → processors (thumb, OCR, AV, CDR) → serving (app/storage/CDN). Note where validation and auth occur.
Identify allowed types, size limits, filename rules, storage keys, and who serves the content. Collect baseline uploads per type and capture resulting URLs and headers.
Exercise bypass families systematically: extension games, MIME/content-type, magic bytes, polyglots, metadata payloads, archive structure, chunk/finalize differentials.
Validate execution and rendering: can uploaded content execute on server or client? Confirm with minimal PoCs and headers analysis.

Discovery Techniques

Surface Map

Endpoints/fields: upload, file, avatar, image, attachment, import, media, document, template
Direct-to-cloud params: key, bucket, acl, Content-Type, Content-Disposition, x-amz-meta-*, cache-control
Resumable APIs: create/init → upload/chunk → complete/finalize; check if metadata/headers can be altered late
Background processors: thumbnails, PDF→image, virus scan queues; identify timing and status transitions

Capability Probes

Small probe files of each claimed type; diff resulting Content-Type, Content-Disposition, and X-Content-Type-Options on download
Magic bytes vs extension: JPEG/GIF/PNG headers; mismatches reveal reliance on extension or MIME sniffing
SVG/HTML probe: do they render inline (text/html or image/svg+xml) or download (attachment)?
Archive probe: simple zip with nested path traversal entries and symlinks to detect extraction rules
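
The magic-bytes-versus-extension probe can be scripted. The following is a minimal sketch, assuming only three image types matter; the names `sniff_type` and `extension_mismatch` and both lookup tables are illustrative, not taken from any particular tool.

```python
import os

# Leading byte signatures for a few common image types.
SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
}

# Extensions conventionally associated with each sniffed type.
EXTENSIONS = {
    "jpeg": {".jpg", ".jpeg"},
    "png": {".png"},
    "gif": {".gif"},
}


def sniff_type(data: bytes):
    """Return the type whose signature matches the leading bytes, else None."""
    for name, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):
            return name
    return None


def extension_mismatch(filename: str, data: bytes) -> bool:
    """True when the final extension disagrees with the sniffed content."""
    ext = os.path.splitext(filename)[1].lower()
    sniffed = sniff_type(data)
    if sniffed is None:
        return True  # unknown content is treated as a mismatch
    return ext not in EXTENSIONS[sniffed]
```

If the server accepts a file for which `extension_mismatch` returns True, it is probably relying on the extension or the declared MIME type rather than the content. A matching signature proves little by itself, since polyglots keep a valid header.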

Detection Channels

Server Execution

Web shell execution (language dependent), config/handler uploads (.htaccess, .user.ini, web.config) enabling execution
Interpreter-side template/script evaluation during conversion (ImageMagick/Ghostscript/ExifTool)

Client Execution

Stored XSS via SVG/HTML/JS if served inline without correct headers; PDF JavaScript; office macros in previewers

Header And Render

Missing X-Content-Type-Options: nosniff, which lets browsers sniff the body and treat it as script or HTML
Content-Type reflection from upload vs server-set; Content-Disposition: inline vs attachment
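
The header checks above are easy to automate once download responses are captured. A minimal sketch follows, assuming the headers are available as a plain dict; the function name `assess_download_headers` and the `ACTIVE_TYPES` set are illustrative choices.

```python
# Media types that browsers render as active content.
ACTIVE_TYPES = {
    "text/html",
    "image/svg+xml",
    "application/xhtml+xml",
    "text/xml",
    "application/xml",
}


def assess_download_headers(headers: dict) -> list:
    """Return a list of findings for one download response."""
    h = {k.lower(): v.strip() for k, v in headers.items()}
    findings = []
    if h.get("x-content-type-options", "").lower() != "nosniff":
        findings.append("missing nosniff")
    if not h.get("content-disposition", "").lower().startswith("attachment"):
        findings.append("not served as attachment")
    media_type = h.get("content-type", "").split(";")[0].strip().lower()
    if not media_type:
        findings.append("missing content-type")
    elif media_type in ACTIVE_TYPES:
        findings.append("active content-type: " + media_type)
    return findings
```

An empty list means the response is served as an attachment with nosniff and a non-active type. Any finding is a lead to confirm in a browser, not proof of exploitability.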

Process Side Effects

AV/CDR race or absence; background job status allows access before scan completes; password-protected archives bypass scanning

Core Payloads

Web Shells And Configs

PHP: GIF polyglot (starts with GIF89a) followed by <?php echo 1; ?>; place where PHP is executed
.htaccess to map extensions to code (AddType/AddHandler); .user.ini (auto_prepend_file/auto_append_file) for PHP-FPM
ASP/JSP equivalents where supported; IIS web.config to enable script execution

Stored XSS

SVG with onload/onerror handlers served as image/svg+xml or text/html
HTML file with script when served as text/html or sniffed due to missing nosniff

MIME Magic Polyglots

Double extensions: avatar.jpg.php, report.pdf.html; mixed casing: .pHp, .PhAr
Magic-byte spoofing: valid JPEG header then embedded script; verify server uses content inspection, not extensions alone

Archive Attacks

Zip Slip: entries with ../../ to escape extraction dir; symlink-in-zip pointing outside target; nested zips
Zip bomb: extreme compression ratios (e.g., 42.zip) to exhaust resources in processors

Toolchain Exploits

ImageMagick/GraphicsMagick legacy vectors (policy.xml may mitigate): crafted SVG/PS/EPS invoking external commands or reading files
Ghostscript in PDF/PS with file operators (%pipe%)
ExifTool metadata parsing bugs; overly large or crafted EXIF/IPTC/XMP fields

Cloud Storage Vectors

S3/GCS presigned uploads: attacker controls Content-Type/Disposition and can set text/html or image/svg+xml with inline disposition so the object renders in the browser
Public-read ACL or permissive bucket policies expose uploads broadly; object key injection via user-controlled path prefixes
Signed URL reuse and stale URLs; serving directly from bucket without attachment + nosniff headers
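
One way to close the presigned-upload gaps is to fix every client-visible field in the upload policy on the server. The sketch below builds an S3-style POST policy document as a plain dict; signing and the SDK call are deliberately left out, and `build_post_policy`, the `uploads/` prefix, and the `user_id` parameter are assumptions for illustration.

```python
import secrets
from datetime import datetime, timedelta, timezone


def build_post_policy(bucket: str, user_id: str, content_type: str,
                      max_bytes: int, ttl_seconds: int = 300) -> dict:
    """Build a policy in which the server, not the client, fixes each field."""
    # Random object key under a server-chosen prefix; no client-supplied path.
    key = f"uploads/{user_id}/{secrets.token_hex(16)}"
    expires = datetime.now(timezone.utc) + timedelta(seconds=ttl_seconds)
    return {
        "expiration": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "conditions": [
            {"bucket": bucket},
            {"key": key},                       # exact match, no prefix wildcard
            {"acl": "private"},
            {"Content-Type": content_type},     # chosen from a server allowlist
            {"Content-Disposition": "attachment"},
            ["content-length-range", 1, max_bytes],
        ],
    }
```

When testing, the interesting question is which of these conditions are absent from the target's policy: any field without an exact-match condition is one the uploader may be able to set. `user_id` must come from the authenticated session, never from the request body.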

Advanced Techniques

Resumable Multipart

Change metadata between init and complete (e.g., swap Content-Type/Disposition at finalize)
Upload benign chunks, then swap last chunk or complete with different source if server trusts client-side digests only
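
The defensive counterpart to both tricks is to re-validate at completion using only what the server itself recorded and stored. A minimal sketch, assuming a hypothetical session record created at init with `content_type`, `content_disposition`, and `expected_sha256` fields:

```python
import hashlib

ALLOWED_TYPES = {"image/png", "image/jpeg"}  # hypothetical allowlist


def finalize_upload(session: dict, stored_chunks: list, requested: dict) -> str:
    """Validate a resumable upload at completion; return the content digest."""
    # Metadata fixed at init must not change at completion.
    for field in ("content_type", "content_disposition"):
        if requested.get(field, session[field]) != session[field]:
            raise ValueError(f"{field} changed between init and complete")
    if session["content_type"] not in ALLOWED_TYPES:
        raise ValueError("type not allowed")
    # Hash the bytes the server actually stored, not a client-reported digest.
    digest = hashlib.sha256()
    for chunk in stored_chunks:
        digest.update(chunk)
    actual = digest.hexdigest()
    if actual != session["expected_sha256"]:
        raise ValueError("content digest mismatch")
    return actual
```

Content inspection (type sniffing, AV, structural checks) should then run on the assembled object, since a digest match only shows the bytes are the ones declared, not that they are safe.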

Filename And Path

Unicode homoglyphs, trailing dots/spaces, device names, reserved characters to bypass validators and filesystem rules
Null-byte truncation on legacy stacks; overlong paths; case-insensitive collisions overwriting existing files
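
Most of these filename tricks stop mattering if the client's name never becomes the storage name. The sketch below normalizes the client filename only to decide on an extension, then returns a random name; `storage_name_for` and the extension allowlist are hypothetical. NFKC folds compatibility forms such as fullwidth letters but not cross-script homoglyphs, which fall through to the allowlist check.

```python
import os
import secrets
import unicodedata

WINDOWS_DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".pdf"}  # hypothetical


def storage_name_for(client_filename: str) -> str:
    """Validate a client filename and return a random server-side name."""
    if "\x00" in client_filename:
        raise ValueError("null byte in filename")
    name = unicodedata.normalize("NFKC", client_filename)
    # Keep only the last path component, whichever separator was used.
    name = name.replace("\\", "/").rsplit("/", 1)[-1]
    # Trailing dots and spaces are silently dropped by some filesystems.
    name = name.rstrip(". ")
    stem, ext = os.path.splitext(name)
    ext = ext.lower()
    if not stem or stem.split(".")[0].upper() in WINDOWS_DEVICE_NAMES:
        raise ValueError("reserved or empty name")
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")
    return secrets.token_hex(16) + ext
```

The original name can still be kept as display metadata in a database column, as long as it is never used to build a path.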

Processing Races

Request file immediately after upload but before AV/CDR completes; or during derivative creation to get unprocessed content
Trigger heavy conversions (large images, deep PDFs) to widen race windows
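
The race only exists when an object is readable before its scan verdict arrives. A minimal in-memory sketch of a quarantine gate follows; `QuarantineStore` and `ScanState` are illustrative names, and a real system would hold this state in its database or object metadata.

```python
from enum import Enum


class ScanState(Enum):
    PENDING = "pending"
    CLEAN = "clean"
    INFECTED = "infected"


class QuarantineStore:
    """Objects are unreadable until a scanner records a clean verdict."""

    def __init__(self):
        self._objects = {}

    def put(self, object_id: str, data: bytes) -> None:
        # Every new object starts in quarantine.
        self._objects[object_id] = [data, ScanState.PENDING]

    def record_verdict(self, object_id: str, clean: bool) -> None:
        state = ScanState.CLEAN if clean else ScanState.INFECTED
        self._objects[object_id][1] = state

    def get(self, object_id: str) -> bytes:
        data, state = self._objects[object_id]
        if state is not ScanState.CLEAN:
            raise PermissionError(f"object is {state.value}")
        return data
```

When probing a target, a request issued immediately after upload should fail in the same way as `get` does here; a successful response during the pending window is the finding.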

Metadata Abuse

Oversized EXIF/XMP/IPTC blocks to trigger parser flaws; payloads in document properties of Office/PDF rendered by previewers

Header Manipulation

Force inline rendering with Content-Type + inline Content-Disposition; test browsers with and without nosniff
Cache poisoning via CDN when the cache key ignores request inputs that change the returned Content-Type/Disposition (no matching Vary or key component)

Filter Bypasses

Validation Gaps

Client-side-only checks; relying on the filename or MIME type supplied by the browser; trusting per-part multipart headers blindly
Extension allowlists without server-side content inspection; magic-bytes only without full parsing

Evasion Tricks

Double extensions, mixed case, hidden dotfiles, extra dots (file..png), long paths with allowed suffix
Multipart name vs filename vs path discrepancies; duplicate parameters and late parameter precedence
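
A validator that looks only at the last extension misses most of these tricks. The sketch below checks every dot-separated segment after the first, case-insensitively; `risky_extensions` and the `EXECUTABLE_EXTENSIONS` set are illustrative and should be adapted to whatever the target server actually executes or renders.

```python
EXECUTABLE_EXTENSIONS = {
    "php", "phtml", "phar", "asp", "aspx", "jsp", "jspx",
    "html", "htm", "svg", "htaccess",
}


def risky_extensions(filename: str) -> set:
    """Return every executable-looking segment anywhere in the filename."""
    base = filename.replace("\\", "/").rsplit("/", 1)[-1]
    # Consider all segments after the first dot, not just the final one.
    segments = [s.strip().lower() for s in base.split(".")[1:]]
    return {s for s in segments if s in EXECUTABLE_EXTENSIONS}
```

For example, `risky_extensions("shell.php.jpg")` returns `{"php"}` even though the final extension is harmless, which matters on servers configured to hand any name containing `.php` to the interpreter.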

Special Contexts

Rich Text Editors

RTEs allow image/attachment uploads and embed links; verify sanitization and serving headers for embedded content

Mobile Clients

Mobile SDKs may send nonstandard MIME or metadata; servers sometimes trust client-side transformations or EXIF orientation

Serverless And CDN

Direct-to-bucket uploads with Lambda/Workers post-processing; verify that security decisions are not delegated to frontends
CDN caching of uploaded content; ensure correct cache keys and headers (attachment, nosniff)
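
For comparison when reviewing what a CDN or function actually returns, here is a sketch of response headers for serving an untrusted upload as a download. The function name `safe_download_headers`, the generic `application/octet-stream` type, and the `Cache-Control` value are assumptions, not requirements from any specific platform.

```python
from urllib.parse import quote


def safe_download_headers(display_name: str) -> dict:
    """Headers that force a download and disable content sniffing."""
    # ASCII fallback: anything outside a small safe set becomes "_".
    ascii_fallback = "".join(
        c if c.isascii() and (c.isalnum() or c in "._-") else "_"
        for c in display_name
    ) or "download"
    # Percent-encoded form for the extended filename* parameter.
    encoded = quote(display_name, safe="")
    disposition = (
        'attachment; filename="' + ascii_fallback + '"; '
        + "filename*=UTF-8''" + encoded
    )
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": disposition,
        "X-Content-Type-Options": "nosniff",
        "Cache-Control": "private, no-store",
    }
```

Because quotes and CR/LF in the display name are replaced or percent-encoded, a hostile filename cannot break out of the header value. Diffing these headers against the ones observed at the edge shows quickly whether the CDN rewrites or drops any of them.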

Parser Hardening

Validate on server: strict allowlist by true type (parse enough to confirm), size caps, and structural checks (dimensions, page count)
Strip active content: convert SVG→PNG; remove scripts/JS from PDF; disable macros; normalize EXIF; consider CDR for risky types
Store outside web root; serve via application or signed, time-limited URLs with Content-Disposition: attachment and X-Content-Type-Options: nosniff
For cloud: private buckets, per-request signed GET, enforce Content-Type/Disposition on GET responses from your app/gateway
Disable execution in upload paths; ignore .htaccess/.user.ini; sanitize keys to prevent path injections; randomize filenames
AV + CDR: scan synchronously when possible; quarantine until verdict; block password-protected archives or process in sandbox
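
As an illustration of "parse enough to confirm" plus structural checks, the sketch below reads a PNG's signature and IHDR chunk and enforces size and pixel caps. The function names and limits are hypothetical, and it does not verify chunk CRCs or decode image data, so it complements a full decode in a sandbox rather than replacing it.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes):
    """Return (width, height) from the IHDR chunk or raise ValueError."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG")
    # The first chunk must be IHDR with a 13-byte payload.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    width, height = struct.unpack(">II", data[16:24])
    if width == 0 or height == 0:
        raise ValueError("zero dimension")
    return width, height


def check_png(data: bytes, max_bytes: int = 5 * 1024 * 1024,
              max_pixels: int = 25_000_000):
    """Enforce byte-size and pixel-count caps before any decoding."""
    if len(data) > max_bytes:
        raise ValueError("file too large")
    width, height = png_dimensions(data)
    if width * height > max_pixels:
        raise ValueError("too many pixels")
    return width, height
```

Checking declared dimensions before decoding guards against decompression-bomb images whose file size is small but whose pixel count is enormous.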

Validation

Demonstrate execution or rendering of active content: web shell reachable, or SVG/HTML executing JS when viewed.
Show filter bypass: upload accepted despite restrictions (extension/MIME/magic mismatch) with evidence on retrieval.
Prove header weaknesses: inline rendering without nosniff or missing attachment; present exact response headers.
Show race or pipeline gap: access before AV/CDR; extraction outside intended directory; derivative creation from malicious input.
Provide reproducible steps: request/response for upload and subsequent access, with minimal PoCs.

False Positives

Upload stored but never served back; or always served as attachment with strict nosniff
Converters run in locked-down sandboxes with no external IO and no script engines; no path traversal on archive extraction
AV/CDR blocks the payload and quarantines; access before scan is impossible by design

Impact

Remote code execution on application stack or media toolchain host
Persistent cross-site scripting and session/token exfiltration via served uploads
Malware distribution via public storage/CDN; brand/reputation damage
Data loss or corruption via overwrite/zip slip; service degradation via zip bombs or oversized assets

Pro Tips

Keep PoCs minimal: tiny SVG/HTML for XSS, a single-line PHP/ASP where relevant, and benign magic-byte polyglots.
Always capture download response headers and final MIME from the server/CDN; that decides browser behavior.
Prefer transforming risky formats to safe renderings (SVG→PNG) rather than attempting complex sanitization.
In presigned flows, constrain all headers and object keys server-side; ignore client-supplied ACL and metadata.
For archives, extract in a chroot/jail with explicit allowlist; drop symlinks and reject traversal.
Test finalize/complete steps in resumable flows; many validations only run on init, not at completion.
Verify background processors with EICAR and tiny polyglots; ensure quarantine gates access until safe.
When you cannot get execution, aim for stored XSS or header-driven script execution; both are impactful.
Validate that CDNs honor attachment/nosniff and do not override Content-Type/Disposition.
Document full pipeline behavior per asset type; defenses must match actual processors and serving paths.

Remember

Secure uploads are a pipeline property. Enforce strict type, size, and header controls; transform or strip active content; never execute or inline-render untrusted uploads; and keep storage private with controlled, signed access.
