Sunday, March 20, 2016

Libvirt

1. Libvirt Python API


  • Compute manager calls the driver
    • `self.driver.pause(instance)`
      • It will suspend the instance
  • Nova Libvirt Driver
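
A minimal sketch of what such a driver call could boil down to at the Libvirt level, assuming the driver looks the domain up by name and calls `suspend()` on it. `lookupByName`, `suspend` and `state` are real libvirt-python methods; `pause_instance` and the two stub classes are hypothetical and exist only so the sketch runs without a hypervisor.

```python
VIR_DOMAIN_RUNNING = 1  # same value as libvirt.VIR_DOMAIN_RUNNING
VIR_DOMAIN_PAUSED = 3   # same value as libvirt.VIR_DOMAIN_PAUSED


def pause_instance(conn, instance_name):
    """Suspend the named domain and return its state code afterwards."""
    dom = conn.lookupByName(instance_name)  # virConnect.lookupByName
    dom.suspend()                           # virDomain.suspend: vCPUs stop, memory stays
    state, _reason = dom.state()            # virDomain.state returns [state, reason]
    return state


class StubDomain:
    """Hypothetical stand-in for a libvirt virDomain object."""

    def __init__(self):
        self._state = VIR_DOMAIN_RUNNING

    def suspend(self):
        self._state = VIR_DOMAIN_PAUSED

    def state(self):
        return [self._state, 0]


class StubConnection:
    """Hypothetical stand-in for a libvirt virConnect object."""

    def __init__(self, names):
        self._domains = {name: StubDomain() for name in names}

    def lookupByName(self, name):
        return self._domains[name]


if __name__ == "__main__":
    conn = StubConnection(["instance-00000001"])
    print(pause_instance(conn, "instance-00000001"))
```

With libvirt-python installed, a real connection from `libvirt.open("qemu:///system")` could be passed in place of the stub.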

2. Libvirt Domain 

  • Libvirt domains are defined via XML
  • Domains defined by Nova are persistent
  • XML is re-generated on Hard Reboot
    • Manual customizations will be overwritten
  • Logging occurs per-domain
    • /var/log/libvirt/qemu/instance_name.log

3. Virsh 

  • virsh is the command line tool for Libvirt
    • Consumes the same API referenced earlier via C
  • virsh list
    • Lists running domains by name (`virsh list --all` includes inactive ones)
  • virsh domname uuid
    • Returns the instance_name
  • virsh dumpxml uuid
    • Configuration for an individual domain
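
Since virsh consumes the same API, the three commands above have rough Python equivalents. The sketch below assumes a libvirt-python style connection object; `listAllDomains`, `lookupByUUIDString`, `name` and `XMLDesc` are real binding methods, while the helper names and stub classes are hypothetical.

```python
def list_domain_names(conn):
    """Roughly `virsh list --all`: names of every defined domain."""
    return sorted(dom.name() for dom in conn.listAllDomains())


def domname(conn, uuid):
    """Roughly `virsh domname uuid`: map a UUID to the instance name."""
    return conn.lookupByUUIDString(uuid).name()


def dumpxml(conn, uuid, flags=0):
    """Roughly `virsh dumpxml uuid`: XML configuration of one domain."""
    return conn.lookupByUUIDString(uuid).XMLDesc(flags)


class StubDomain:
    """Hypothetical stand-in for a libvirt virDomain object."""

    def __init__(self, name, uuid, xml):
        self._name, self._uuid, self._xml = name, uuid, xml

    def name(self):
        return self._name

    def UUIDString(self):
        return self._uuid

    def XMLDesc(self, flags=0):
        return self._xml


class StubConnection:
    """Hypothetical stand-in for a libvirt virConnect object."""

    def __init__(self, domains):
        self._domains = list(domains)

    def listAllDomains(self, flags=0):
        return list(self._domains)

    def lookupByUUIDString(self, uuid):
        for dom in self._domains:
            if dom.UUIDString() == uuid:
                return dom
        raise KeyError(uuid)
```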

4. Libvirt Domain XML

  • Translated by Libvirt to ultimately call QEMU with the right arguments
  • The XML generated by Nova ends up being modified, fairly heavily, by Libvirt upon definition
  • Key difference between "active" and "inactive" XML
    • Numerous values are derived at time of instance start
    • Output from an active "dumpxml" will likely fail to define
      • Use `virsh dumpxml --inactive uuid`
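
To make the active/inactive difference concrete, here is a small sketch that lists elements present only in the active XML. The two XML snippets are trimmed, made-up examples (a device `<alias>` is one value Libvirt typically adds at start time), and the function names are hypothetical. Attribute-only differences, such as the `id` on `<domain>`, are not detected by this simple comparison. In libvirt-python the inactive form could be fetched with `dom.XMLDesc(libvirt.VIR_DOMAIN_XML_INACTIVE)`.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative documents; real dumpxml output is far longer.
ACTIVE_XML = """<domain type='kvm' id='7'>
  <name>instance-00000001</name>
  <devices>
    <interface type='bridge'>
      <alias name='net0'/>
    </interface>
  </devices>
</domain>"""

INACTIVE_XML = """<domain type='kvm'>
  <name>instance-00000001</name>
  <devices>
    <interface type='bridge'/>
  </devices>
</domain>"""


def element_paths(xml_text):
    """Return the set of slash-separated element paths in an XML document."""
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def runtime_only_elements(active_xml, inactive_xml):
    """Element paths that appear in the active XML but not the inactive one."""
    return sorted(element_paths(active_xml) - element_paths(inactive_xml))


if __name__ == "__main__":
    print(runtime_only_elements(ACTIVE_XML, INACTIVE_XML))
```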

5. Libvirtd Configuration

  • Critically important tuning for Libvirt 1.1.x and newer, for example
    • max_clients = 50

    Saturday, March 19, 2016

    Live Migration


    1. Live Migration Workflow

    • Verify the storage backend is appropriate for the migration type
      • Perform a shared storage check for normal migrations
      • Do the inverse for block migrations
      • Checks are run on both the source and destination, orchestrated via RPC calls from the scheduler
    • On the destination
      • Create the necessary volume connections
      • If block migration, create the instance directory, populate missing backing files from Glance and create empty instance disks
    • On the source
      • Initiate the actual live migration
    • Upon completion
      • Generate the Libvirt XML and define it on the destination
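
The workflow above can be sketched as a small planning function. This is only an illustration of the ordering and of the storage pre-check (shared storage for normal migrations, the inverse for block migrations); every name here is hypothetical and none of it is Nova's actual code.

```python
class MigrationPreCheckError(Exception):
    """Raised when the storage backend does not suit the migration type."""


def check_storage(block_migration, shared_storage):
    # Normal live migration expects shared storage; block migration expects
    # the inverse, because the disks are copied along with the state.
    if block_migration and shared_storage:
        raise MigrationPreCheckError("block migration expects non-shared storage")
    if not block_migration and not shared_storage:
        raise MigrationPreCheckError("live migration expects shared storage")


def plan_live_migration(block_migration, source_shared, dest_shared):
    """Return the ordered (host, step) list after the pre-checks pass."""
    for shared in (source_shared, dest_shared):  # checked on both ends
        check_storage(block_migration, shared)

    steps = [("destination", "create volume connections")]
    if block_migration:
        steps += [
            ("destination", "create instance directory"),
            ("destination", "fetch missing backing files from Glance"),
            ("destination", "create empty instance disks"),
        ]
    steps.append(("source", "initiate live migration"))
    steps.append(("destination", "generate Libvirt XML and define domain"))
    return steps


if __name__ == "__main__":
    for host, step in plan_live_migration(True, False, False):
        print(host, "->", step)
```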

    2. Migrations

    • Why migration
      • Operations
        • Key to performing non-disruptive work
        • Re-balancing workloads and resources
      • Expectations versus reality
        • Special snowflakes
        • Ephemeral instance and the "cloud" way
    • Types of migration
      • Migrate
        • Completely "cold", libvirt does almost nothing
        • Shares a code path with "resize"
        • Extremely brittle (uses SSH and copies files around)
      • Live migration
        • Orchestrated almost entirely by Libvirt (via DomainMigrateToURI)
      • Block migration
        • Similar code path to live migration
        • Riskier and more brittle (disks are moving along with state)

    3. Live Migrations

    • Nova offloads capability comparisons to Libvirt
      • The API equivalent of `virsh capabilities` is run by the scheduler on the source and destination
    • Nova live migration
      • Important config options
        • live_migration_flag += VIR_MIGRATE_LIVE
        • block_migration_flag += VIR_MIGRATE_LIVE
      • Standardized virtual CPU flags
        • libvirt_cpu_mode = custom
        • libvirt_cpu_model = cpu64-rhel6
      • "Max Downtime" (not currently tunable)
        • Look for upstream patches soon
        • QEMU keeps iterating until the cutover can be done within "30" milliseconds
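
The `+=` notation for the flag options reads as "append `VIR_MIGRATE_LIVE` to the existing value". A minimal sketch of how such an option could be turned into the bitmask passed to Libvirt, assuming the option holds a comma-separated list of libvirt constant names. The numeric values mirror libvirt's `virDomainMigrateFlags`; the function names and the example option value are hypothetical.

```python
# Subset of libvirt's virDomainMigrateFlags, copied here so the sketch
# runs without libvirt-python installed.
MIGRATE_FLAGS = {
    "VIR_MIGRATE_LIVE": 1,
    "VIR_MIGRATE_PEER2PEER": 2,
    "VIR_MIGRATE_TUNNELLED": 4,
    "VIR_MIGRATE_PERSIST_DEST": 8,
    "VIR_MIGRATE_UNDEFINE_SOURCE": 16,
    "VIR_MIGRATE_NON_SHARED_INC": 128,
}


def append_flag(option_value, flag_name):
    """Append a flag name to a comma-separated option unless already present."""
    names = [n.strip() for n in option_value.split(",") if n.strip()]
    if flag_name not in names:
        names.append(flag_name)
    return ", ".join(names)


def parse_migration_flags(option_value):
    """OR together the values of every flag name listed in the option."""
    flags = 0
    for name in option_value.split(","):
        name = name.strip()
        if name:
            flags |= MIGRATE_FLAGS[name]
    return flags


def is_live(flags):
    """True when the bitmask asks for a live (non-paused) migration."""
    return bool(flags & MIGRATE_FLAGS["VIR_MIGRATE_LIVE"])


if __name__ == "__main__":
    value = "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER"
    value = append_flag(value, "VIR_MIGRATE_LIVE")
    print(value, parse_migration_flags(value))
```

Without `VIR_MIGRATE_LIVE` in the mask, Libvirt pauses the guest for the duration of the transfer, which is why the option matters.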

    4. Brittle Operations

    • Any long running, synchronous tasks
      • All migrations (memory sync, disk sync, etc)
    • No graceful way to stop services
    • Most prone to failure
      • Migrate and resize
      • Live migration (block or otherwise)
      • Instance snapshot

    5. Recovering from failures

    • Always investigate before forcing actions
      • Look at the log for exceptions
      • Check whether an instance is running on multiple hypervisors
      • `nova reset-state --active` and `nova reboot --hard` can go a long way
    • Sometimes, brute force is going to be required
      • `kill -9` the qemu or kvm processes
      • Alter the database records, commonly `host`
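
One of the checks above, whether an instance is running on multiple hypervisors, can be sketched as a comparison of per-host domain lists. The input is assumed to have been collected beforehand, for example from `virsh list --name` on each hypervisor; the host names, instance names and function names are hypothetical.

```python
def hosts_running(instance_name, domains_by_host):
    """Return the hosts whose running-domain list contains the instance."""
    return sorted(
        host for host, names in domains_by_host.items() if instance_name in names
    )


def find_duplicates(domains_by_host):
    """Map each instance seen on more than one host to those hosts."""
    seen = {}
    for host, names in domains_by_host.items():
        for name in names:
            seen.setdefault(name, []).append(host)
    return {name: sorted(hosts) for name, hosts in seen.items() if len(hosts) > 1}


if __name__ == "__main__":
    running = {
        "compute-01": ["instance-00000001", "instance-00000002"],
        "compute-02": ["instance-00000002"],
    }
    print(find_duplicates(running))
```

Any instance reported here deserves investigation before a forced reset or a database edit.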

    6. "Stuck" Live Migrations

    • Live migrations can get stuck
    • Instances left in a paused state on both ends
      • Monitor socket is unresponsive, Libvirt is helpless
    • Generally a result of an overly aggressive "max downtime" and rapidly changing memory state (e.g., JVM)
    • Can be a result of a QEMU issue/bug
      • managedSave (suspend) will generally be prone to this as well



    OpenStack Overview




    1. Architecture






    2. Commands


    • On a compute node, run `virsh capabilities` to see the capabilities of that node.
    • `virsh dumpxml instance-id`
      • Describes the VM
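
The output of `virsh capabilities` is an XML document, so the host CPU details can be pulled out with a few lines of Python. The sketch below uses trimmed, made-up capabilities XML and hypothetical function names. In libvirt-python the same document is returned by `conn.getCapabilities()`.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative capabilities documents; real output is much longer.
SOURCE_CAPS = """<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
      <model>SandyBridge</model>
      <vendor>Intel</vendor>
      <feature name='vmx'/>
      <feature name='aes'/>
    </cpu>
  </host>
</capabilities>"""

DEST_CAPS = """<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
      <model>Westmere</model>
      <vendor>Intel</vendor>
      <feature name='vmx'/>
    </cpu>
  </host>
</capabilities>"""


def host_cpu_summary(capabilities_xml):
    """Extract arch, model, vendor and feature names of the host CPU."""
    cpu = ET.fromstring(capabilities_xml).find("host/cpu")
    return {
        "arch": cpu.findtext("arch"),
        "model": cpu.findtext("model"),
        "vendor": cpu.findtext("vendor"),
        "features": sorted(f.get("name") for f in cpu.findall("feature")),
    }


def missing_features(source_xml, dest_xml):
    """Features listed for the source CPU but not for the destination CPU."""
    src = set(host_cpu_summary(source_xml)["features"])
    dst = set(host_cpu_summary(dest_xml)["features"])
    return sorted(src - dst)


if __name__ == "__main__":
    print(host_cpu_summary(SOURCE_CAPS))
    print(missing_features(SOURCE_CAPS, DEST_CAPS))
```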

    3. Reboot

    • Soft reboot
      • It relies completely on the guest OS and ACPI passed through QEMU
    • Hard reboot
      • Just make it work.
      • It resolves most issues
      • It operates at the hypervisor and Nova level
      • It makes zero assumptions about the state of the hypervisor
        • Notable effort has been placed to make internal operations idempotent, and call them here.
      • Steps
        • Destroy the domain
          • Equivalent of `virsh destroy`
          • Does not destroy data, only the QEMU process
          • Effectively a `kill -9` of the QEMU process
        • Re-establish any and all volume connections.
        • Regenerate the Libvirt XML
        • Check for and re-download any missing backing files (instance_dir/_base)
        • Plug VIFs (re-create bridges, VLAN interfaces, etc.)
        • Regenerate and apply iptables rules
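
The listed steps can be sketched as a fixed sequence of idempotent operations. This only illustrates the ordering from the notes; the step names, `hard_reboot` and the recording stub are hypothetical and do not correspond to Nova's real method names.

```python
# Order follows the steps listed in the notes above.
HARD_REBOOT_STEPS = (
    "destroy_domain",               # like `virsh destroy`: kills QEMU, keeps data
    "connect_volumes",              # re-establish all volume connections
    "generate_xml",                 # regenerate the Libvirt XML
    "fetch_missing_backing_files",  # re-download anything missing from _base
    "plug_vifs",                    # re-create bridges, VLAN interfaces, etc.
    "apply_firewall_rules",         # regenerate and apply iptables rules
)


def hard_reboot(instance, ops):
    """Run every step in order; each one is expected to be safe to repeat."""
    completed = []
    for step in HARD_REBOOT_STEPS:
        getattr(ops, step)(instance)
        completed.append(step)
    return completed


class RecordingOps:
    """Hypothetical stand-in that records which operations were called."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes not set in __init__, i.e. the step names.
        def record(instance):
            self.calls.append((name, instance))

        return record


if __name__ == "__main__":
    ops = RecordingOps()
    print(hard_reboot("instance-00000001", ops))
```

Because no step assumes anything about the prior state of the hypervisor, running the sequence twice in a row should leave the instance in the same place.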



    Wednesday, March 9, 2016

    My paper list to read

    NDSS17

    A Large-scale Analysis of the Mnemonic Password Advice
    Show Me the Money! Finding Flawed Implementations of Third-party In-app Payment in Android Apps

    A Call to ARMs: Understanding the Costs and Benefits of JIT Spraying Mitigations

    Internet-scale Probing of CPS: Inference, Characterization and Orchestration Analysis

    Dachshund: Digging for and Securing (Non-)Blinded Constants in JIT Code

    Ramblr: Making Reassembly Great Again

    BOOMERANG: Exploiting the Semantic Gap in Trusted Execution Environments

    A Broad View of the Ecosystem of Socially Engineered Exploit Documents

    Dark Hazard: Learning-based, Large-Scale Discovery of Hidden Sensitive Operations in Android Apps

    ASLR on the Line: Practical Cache Attacks on the MMU

    Hey, My Malware Knows Physics! Attacking PLCs with Physical Model Aware Rootkit

    Wi-Fly?: Detecting Privacy Invasion Attacks by Consumer Drones

    HOP: Hardware makes Obfuscation Practical

    TenantGuard: Scalable Runtime Verification of Cloud-Wide VM-Level Network Isolation

    Broken Hearted: How To Attack ECG Biometrics

    DELTA: A Security Assessment Framework for Software-Defined Networks

    Obfuscation-Resilient Privacy Leak Detection for Mobile Apps Through Differential Analysis

    A2C: Self Destructing Exploit Executions via Input Perturbation

    Address Oblivious Code Reuse: On the Effectiveness of Leakage Resilient Diversity




    USENIX2016 

    You are Who You Know and How You Behave: Attribute Inference Attacks via Users' Social Friends and Behaviors 

    Stealing Machine Learning Models via Prediction APIs

    FlowFence: Practical Data Protection for Emerging IoT Application Frameworks

    Towards Measuring and Mitigating Social Engineering Malware Download Attacks

    Specification Mining for Intrusion Detection in Networked Control Systems

    APISan: Sanitizing API Usages through Semantic Cross-checking

    Undermining Entropy-based Information Hiding (And What to do About it)

    zxcvbn: Low-Budget Password Strength Estimation

    Mirror: Enabling Proofs of Data Replication and Retrievability in the Cloud

    ARMageddon: Cache Attacks on Mobile Devices 

    Hidden Voice Commands

    OblivP2P: An Oblivious Peer-to-Peer Content Sharing System

    AuthLoop: End-to-End Cryptographic Authentication for Telephony over Voice Channels

    Trusted Browsers for Uncertain Times

    Virtual U: Defeating Face Liveness Detection by Building Virtual Models From Your Public Photos

    One Bit Flips, One Cloud Flops: Cross-VM Row Hammer Attacks and Privilege Escalation

    All Your Queries Are Belong to Us: The Power of File-Injection Attacks on Searchable Encryption

    Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks

    SGX-Enabled Oblivious Machine Learning

    Poking Holes into Information Hiding

    Off-Path TCP Exploits: Global Rate Limit Considered Dangerous

    Request and Conquer: Exposing Cross-Origin Resource Size



    Sigcomm


    WebPerf: Evaluating What-If Scenarios for Cloud-hosted Web Applications


    Taking the Blame Game out of Data Centers Operations with NetPoirot 




    SAC

    Accurate Spear Phishing Campaign Attribution and Early Detection

    Rich Cloud-Based Web Applications with CloudBrowser 2.0 

    Controlling the Elasticity of Web Applications on Cloud Computing


    AsiaCCS

    StormDroid: A Streaminglized Machine Learning-based System for Detecting Android Malware

    Bilateral-secure Signature by Key Evolving

    Efficient Authenticated Multi-Pattern Matching

    Attestation Transparency: Building secure Internet services for legacy clients

    Congesting the Internet with Coordinated And Decentralized Pulsating Attacks

    Privacy and Utility of Inference Control Mechanisms for Social Computing Applications

    StemJail: Dynamic Role Compartmentalization

    Your Credentials Are Compromised, Do Not Panic: You Can Be Well Protected


    DSN
    Power-aware Checkpointing: Toward the Optimal Checkpointing Interval under Power Capping

    A Sharper Sense of Self: Probabilistic Reasoning of Program Behaviors for Anomaly Detection with Context Sensitivity

    Characterizing the Consistency of Online Services

    Balancing Security and Performance for Agility in Dynamic Threat Environments
    Specification Mining for Intrusion Detection in Networked Control Systems



    CCS 2016
    SmartWalk: Enhancing Social Network Security via Adaptive Random Walks

    Acing the IOC Game: Toward Automatic Discovery and Analysis of Open-Source Cyber Threat Intelligence

    Content Security Problems? Evaluating the Effectiveness of Content Security Policy in the Wild

    CSP is Dead, Long Live CSP: On the Insecurity of Whitelists and the Future of the Content Security Policy

    CSPAutoGen: Black-box Enforcement of Content Security Policy upon Real-World Websites

    EpicRec: Towards Practical Differentially Private Framework for Personalized Recommendation

    Generic Attacks on Secure Outsourced Databases

    Identifying the Scanners and Attack Infrastructure behind Amplification DDoS attacks

    Lurking Malice in the Cloud: Understanding and Detecting Cloud Repository as a Malicious Service